(54) Title: NOVEL HUMAN GENES AND GENE EXPRESSION PRODUCTS
(57) Abstract: The invention provides novel polynucleotides. The invention further provides novel members of protein families,
and polynucleotides that are differentially expressed in cancer cells relative to normal cells, and in metastatic cancer cells relative to
normal cells or non-metastatic cancer cells.

NOVEL HUMAN GENES AND GENE EXPRESSION PRODUCTS

FIELD OF THE INVENTION 

The present invention relates to novel polynucleotides of human origin 
and the encoded gene products. 

BACKGROUND OF THE INVENTION

Identification of novel polynucleotides, particularly those that encode an
expressed gene product, is important in the advancement of drug discovery, diagnostic
technologies, and the understanding of the progression and nature of complex diseases
such as cancer. Identification of genes expressed in different cell types isolated from
sources that differ in disease state or stage, developmental stage, exposure to various
environmental factors, the tissue of origin, the species from which the tissue was
isolated, and the like is key to identifying the genetic factors that are responsible for the
phenotypes associated with these various differences.

This invention provides novel human polynucleotides, the polypeptides
encoded by these polynucleotides, and the genes and proteins corresponding to these
novel polynucleotides.

SUMMARY OF THE INVENTION 

This invention relates to novel human polynucleotides and variants
thereof, their encoded polypeptides and variants thereof, to genes corresponding to these
polynucleotides and to proteins expressed by the genes. The invention also relates to
diagnostics and therapeutics comprising such novel human polynucleotides, their
corresponding genes or gene products, including probes, antisense nucleotides, and
antibodies. The polynucleotides of the invention correspond to a polynucleotide
comprising the sequence information of at least one of SEQ ID NOs:1-3351.

Various aspects and embodiments of the invention will be readily
apparent to the ordinarily skilled artisan upon reading the description provided herein.

DETAILED DESCRIPTION OF THE INVENTION 

The invention relates to polynucleotides comprising the disclosed
nucleotide sequences, to full length cDNA, mRNA, genomic sequences, and genes
corresponding to these sequences and degenerate variants thereof, and to polypeptides
encoded by the polynucleotides of the invention and polypeptide variants.

Polypeptide variants differ from wild type protein in having one or more
amino acid substitutions that either enhance, add, or diminish a biological activity of the
wild type protein.

Six of the polynucleotides disclosed herein encode new members of the MKK
kinase family; the coding region is found within the nucleotide region in parentheses: SEQ
ID NO:29 (nucleotides 295-421); SEQ ID NO:31 (298-397); SEQ ID NO:196 (37-322);
SEQ ID NO:3175 (nucleotides 14-164); SEQ ID NO:3190 (229-390); and SEQ ID
NO:3281 (15-182). Twenty-four of the polynucleotides encode new members of the family
of transcription factor proteins having a basic region plus leucine zipper: SEQ ID NO:410
(42-191); SEQ ID NO:552 (116-288); SEQ ID NO:768 (116-288); SEQ ID NO:822 (108-262);
SEQ ID NO:836 (158-353); SEQ ID NO:1288 (73-234); SEQ ID NO:1365 (69-257);
SEQ ID NO:1540 (289-471); SEQ ID NO:1549 (200-391); SEQ ID NO:1556 (163-354);
SEQ ID NO:1557 (207-398); SEQ ID NO:1563 (107-298); SEQ ID NO:1622 (180-365);
SEQ ID NO:1630 (100-291); SEQ ID NO:1704 (184-372); SEQ ID NO:1808 (36-161);
SEQ ID NOM454 (49-209); SEQ ID NO:2363 (48-211); SEQ ID NO:2424 (43-194); 
SEQ ID NO:3147 (190-369); SEQ ID NO:3152 (129-320); SEQ ID NO:3158 (167- 
334); and SEQ ID NO:3208 (34-256). 

SEQ ID NOs:186 (175-395); 2591 (60-165); 3307 (43-321); and 3339
(94-342) encode polypeptides having an SH2 domain, and SEQ ID NOs:234 (23-121),
1832 (18-173), and 1835 (57-206) encode polypeptides having an SH3 domain. Nine
polynucleotides encode new members of the family of proteins having Ank repeat regions:
SEQ ID NO:187 (358-432); SEQ ID NO:1268 (238-315); SEQ ID NO:1804 (301-378);
SEQ ID NO:1819 (278-355); SEQ ID NO:1839 (224-307); SEQ ID NO:1830 (184-267);
SEQ ID NO:2562 (18-101); SEQ ID NO:3015 (131-214); and SEQ ID NO:3267 (97-180).

The following eleven polynucleotides encode polypeptides having a C2H2
type zinc finger: SEQ ID NOs:308 (110-172); 807 (339-392); 1324 (294-356); 1503 (154-216);
1527 (156-212); 1674 (196-258); 1779 (64-126); 1801 (295-351); 3081 (190-252);
3193 (293-355); and 3306 (161-223). Eight polynucleotides encode polypeptides of the
family of ATPases: SEQ ID NOs:431 (71-428); 639 (157-561); 2135 (2-401); 2684 (9-461);
2859 (100-320); 3178 (45-386); 3197 (281-343); and 3266 (8-139). Polypeptides
having a fibronectin type III domain are encoded by SEQ ID NOs:746 (209-427) and 1192
(186-416). Polypeptides having an EF-hand domain are encoded by SEQ ID NOs:820 (341-406);
1755 (281-367); and 3285 (16-102). Six polypeptides of the protein kinase family are
encoded by SEQ ID NOs:1157 (41-444); 1478 (54-437); 1496 (241-520); 2286 (12-182);
2969 (5-387); and 3190 (118-390).

LIM domain-containing polypeptides are encoded by SEQ ID NOs:1269
(79-240); 1309 (248-404); 1360 (222-377); and 1386 (243-398). Two polypeptides of the
family having a C2 domain (protein kinase C-like) are encoded by SEQ ID NOs:1325 (1-234)
and 2282 (183-353). Polypeptides having a WD domain, G-beta repeat motif are
encoded by SEQ ID NOs:1336 (66-164); 1380 (42-140); 1711 (263-361); 1762 (236-334);
1909 (160-258); 2218 (127-225); 3047 (191-292); 3108 (275-367); and 3292 (208-300).

SEQ ID NO:1410 (222-350) encodes a member of the trypsin family. SEQ
ID NOs:1417 (8-354); 2281 (20-387); and 2310 (20-371) encode members of the protein
tyrosine phosphatase family. SEQ ID NOs:1464 (4-180) and 1514 (2-252) encode
members of the family having an RNA recognition motif (also known as RRM, RBD, or
RNP domain). SEQ ID NOs:1496 (241-520) and 3297 (7-153) encode helicases having a
conserved C-terminal domain. SEQ ID NO:1538 (9-635) encodes a member of the wnt
family of developmental signaling proteins.

Three polynucleotides encode polypeptides having a homeobox domain:
SEQ ID NOs:1676 (9-86); 1820 (123-299); and 1821 (127-303). A novel thioredoxin is
encoded by SEQ ID NO:1677 (316-369). Two novel members of the ras family are
encoded by SEQ ID NOs:1688 (109-410) and 3258 (138-394). A novel polypeptide having a
phosphatidylinositol-specific phospholipase C Y-domain is encoded by SEQ ID NO:1707
(92-439). A novel serine carboxypeptidase is encoded by SEQ ID NO:1744 (238-433). A
novel polypeptide having N-terminal homology in the Ets domain is encoded by SEQ ID
NO:1811 (184-315). A novel polypeptide having a bromodomain is encoded by SEQ ID
NO:1814 (127-294). A novel polypeptide having a double-stranded RNA binding motif is
encoded by SEQ ID NO:1818 (9-146). A novel polypeptide having a G-protein alpha
subunit is encoded by SEQ ID NO:1846 (12-398).

SEQ ID NOs:1911 (35-151) and 1980 (60-197) encode polypeptides
having a C3HC4 type zinc finger domain (RING finger). SEQ ID NO:2065 (253-306)
encodes a polypeptide having a CCHC zinc finger domain. SEQ ID NO:2216 (90-179)
encodes a polypeptide having a WW/rsp5/WWP domain. SEQ ID NO:2428 (25-350)
encodes a polypeptide member of the dual specificity phosphatase family, having a
catalytic domain.

SEQ ID NOs:2577 (0-311); 3183 (14-215); and 3195 (0-215) encode
members of the 4 transmembrane segment integral membrane protein family. SEQ ID
NOs:2826 (116-400) and 2871 (198-392) encode polypeptides of the DEAD and DEAH
box helicase family. SEQ ID NO:2944 (18-281) encodes a polypeptide having a
calpain large subunit, domain III.

SEQ ID NO:3274 (11-187) encodes a eukaryotic transcription factor
with a fork head domain. SEQ ID NO:3345 (65-271) encodes a polypeptide having a
PDZ domain, and SEQ ID NO:3351 (124-270) encodes a polypeptide in the family of
phorbol esters/glycerol binding proteins.

Described below are polynucleotide compositions encompassed by the
invention, methods for obtaining cDNA or genomic DNA encoding a full-length gene
product, expression of these polynucleotides and genes, identification of structural motifs
of the polynucleotides and genes, identification of the function of a gene product encoded
by a gene corresponding to a polynucleotide of the invention, use of the provided
polynucleotides as probes and in mapping and in tissue profiling, use of the corresponding
polypeptides and other gene products to raise antibodies, and use of the polynucleotides
and their encoded gene products for therapeutic and diagnostic purposes.

Polynucleotide Compositions 

The scope of the invention with respect to polynucleotide compositions
includes, but is not necessarily limited to, polynucleotides having a sequence set forth in
any one of SEQ ID NOs:1-3351; polynucleotides obtained from the biological materials
described herein or other biological sources (particularly human sources) by
hybridization under stringent conditions (particularly conditions of high stringency);
genes corresponding to the provided polynucleotides; variants of the provided
polynucleotides and their corresponding genes, particularly those variants that retain a
biological activity of the encoded gene product (e.g., a biological activity ascribed to a
gene product corresponding to the provided polynucleotides as a result of the
assignment of the gene product to a protein family(ies) and/or identification of a
functional domain present in the gene product). Other nucleic acid compositions
contemplated by and within the scope of the present invention will be readily apparent
to one of ordinary skill in the art when provided with the disclosure here.

"Polynucleotide" and "nucleic acid" as used herein with reference to nucleic acids of
the composition are not intended to be limiting as to the length or structure of the nucleic
acid unless specifically indicated.

The invention features polynucleotides that are expressed in human
tissue, specifically human colon, breast, and/or lung tissue. Novel nucleic acid
compositions of the invention comprise a sequence set forth in any one of SEQ ID
NOs:1-3351 or an identifying sequence thereof. An "identifying sequence" is a
contiguous sequence of residues at least about 10 nt to about 20 nt in length, usually at
least about 50 nt to about 100 nt in length, that uniquely identifies a polynucleotide
sequence, e.g., exhibits less than 90%, usually less than about 80% to about 85%
sequence identity to any contiguous nucleotide sequence of more than about 20 nt.
Thus, the subject novel nucleic acid compositions include full length cDNAs or mRNAs
that encompass an identifying sequence of contiguous nucleotides from any one of SEQ
ID NOs:1-3351.

The polynucleotides of the invention also include polynucleotides having
sequence similarity or sequence identity. Nucleic acids having sequence similarity are
detected by hybridization under low stringency conditions, for example, at 50°C and
10XSSC (0.9 M saline/0.09 M sodium citrate) and remain bound when subjected to
washing at 55°C in 1XSSC. Sequence identity can be determined by hybridization
under stringent conditions, for example, at 50°C or higher and 0.1XSSC (9 mM
saline/0.9 mM sodium citrate). Hybridization methods and conditions are well known
in the art, see, e.g., U.S. Patent No. 5,707,829. Nucleic acids that are substantially
identical to the provided polynucleotide sequences, e.g., allelic variants, genetically
altered versions of the gene, etc., bind to the provided polynucleotide sequences (SEQ
ID NOs:1-3351) under stringent hybridization conditions. By using probes, particularly
labeled probes of DNA sequences, one can isolate homologous or related genes. The
source of homologous genes can be any species, e.g., primate species, particularly
human; rodents, such as rats and mice; canines, felines, bovines, ovines, equines, yeast,
nematodes, etc.

Preferably, hybridization is performed using at least 15 contiguous
nucleotides (nt) of at least one of SEQ ID NOs:1-3351. That is, when at least 15
contiguous nt of one of the disclosed SEQ ID NOs. is used as a probe, the probe will
preferentially hybridize with a nucleic acid comprising the complementary sequence,
allowing the identification and retrieval of the nucleic acids that uniquely hybridize to
the selected probe. Probes from more than one SEQ ID NO. can hybridize with the
same nucleic acid if the cDNA from which they were derived corresponds to one
mRNA. Probes of more than 15 nt can be used, e.g., probes of from about 18 nt to
about 100 nt, but 15 nt represents sufficient sequence for unique identification.
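
The probe-to-complement relationship described above can be illustrated in silico. The following is a minimal sketch, assuming perfect Watson-Crick pairing only and ignoring stringency, temperature, and salt; the function names and the 15-nt example probe are hypothetical and are not taken from the sequence listing.

```python
from typing import List

# Watson-Crick complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_sites(probe: str, target: str) -> List[int]:
    """Return 0-based start positions on `target` where `probe` could pair
    perfectly, i.e. where the target carries the probe's reverse complement."""
    site = reverse_complement(probe)
    target = target.upper()
    hits = []
    start = target.find(site)
    while start != -1:
        hits.append(start)
        start = target.find(site, start + 1)  # allow overlapping sites
    return hits


# Hypothetical 15-nt probe and a target strand that contains its complement.
probe = "ATGGCGTACGTTAGC"
target = "CCC" + reverse_complement(probe) + "GGG"
print(probe_sites(probe, target))
```

A single reported position for a 15-nt probe corresponds to the "unique identification" case discussed in the text; several positions would indicate that a longer probe is needed.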

The polynucleotides of the invention also include naturally occurring
variants of the nucleotide sequences (e.g., degenerate variants, allelic variants).
Variants of the polynucleotides of the invention are identified by hybridization of
putative variants with nucleotide sequences disclosed herein, preferably by
hybridization under stringent conditions. For example, by using appropriate wash
conditions, variants of the polynucleotides of the invention can be identified where the
allelic variant exhibits at most about 25-30% base pair (bp) mismatches relative to the
selected polynucleotide probe. In general, allelic variants contain 15-25% bp
mismatches, and can contain as few as 5-15%, or 2-5%, or 1-2% bp mismatches,
as well as a single bp mismatch.
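
The mismatch percentages above can be made concrete with a short calculation. This is a sketch only, assuming an ungapped, position-by-position comparison of two equal-length sequences; the function names and the 30% default threshold (taken from the upper figure quoted above) are illustrative choices.

```python
def percent_mismatch(probe: str, candidate: str) -> float:
    """Percentage of positions at which two equal-length sequences differ."""
    if not probe or len(probe) != len(candidate):
        raise ValueError("sequences must be non-empty and of equal length")
    mismatches = sum(a != b for a, b in zip(probe.upper(), candidate.upper()))
    return 100.0 * mismatches / len(probe)


def within_mismatch_limit(probe: str, candidate: str, limit: float = 30.0) -> bool:
    """True if the candidate differs from the probe by at most `limit` percent."""
    return percent_mismatch(probe, candidate) <= limit


# Hypothetical 20-nt probe and a candidate variant with one substitution.
probe = "ACGTACGTACGTACGTACGT"
variant = "ACGTACGTACGTACGTACGA"
print(percent_mismatch(probe, variant), within_mismatch_limit(probe, variant))
```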

The invention also encompasses homologs corresponding to the
polynucleotides of SEQ ID NOs:1-3351, where the source of homologous genes can be
any mammalian species, e.g., primate species, particularly human; rodents, such as rats;
canines, felines, bovines, ovines, equines, yeast, nematodes, etc. Between mammalian
species, e.g., human and mouse, homologs generally have substantial sequence
similarity, e.g., at least 75% sequence identity, usually at least 90%, more usually at
least 95% between nucleotide sequences. Sequence similarity is calculated based on a
reference sequence, which may be a subset of a larger sequence, such as a conserved
motif, coding region, flanking region, etc. A reference sequence will usually be at least
about 18 contiguous nt long, more usually at least about 30 nt long, and may extend to
the complete sequence that is being compared. Algorithms for sequence analysis are
known in the art, such as BLAST, described in Altschul et al., J. Mol. Biol. (1990)
215:403-10.

In general, variants of the invention have a sequence identity greater than
at least about 65%, preferably at least about 75%, more preferably at least about 85%,
and can be greater than at least about 90%, 91%, 92%, 93%, 94%, 95%, or 96%, most
preferably 97%, 98% or 99%. For the purposes of this invention, a preferred method of
calculating percent identity is the Smith-Waterman algorithm, applied as follows.
Global DNA sequence identity must be greater than 65% as determined by the Smith-Waterman
homology search algorithm as implemented in MPSRCH program (Oxford
Molecular) using an affine gap search with the following search parameters: gap open
penalty, 12; and gap extension penalty, 1.
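
For orientation, a Smith-Waterman local alignment score with affine gaps can be sketched as below. This is not the MPSRCH implementation. The gap open penalty of 12 and gap extension penalty of 1 come from the text; the match and mismatch scores (+5 and -4) and the convention that the first gapped position costs the open penalty and each further position costs the extension penalty are assumptions made for illustration. The sketch returns the best score only and does not compute percent identity, which would additionally require a traceback.

```python
def smith_waterman_score(a: str, b: str, match: int = 5, mismatch: int = -4,
                         gap_open: int = 12, gap_extend: int = 1) -> int:
    """Best local alignment score of a and b with affine gap penalties.

    H holds the best score of an alignment ending at (i, j); E and F hold the
    best scores of alignments ending in a gap that consumes b or a respectively.
    """
    neg_inf = float("-inf")
    cols = len(b)
    h_prev = [0] * (cols + 1)
    f_prev = [neg_inf] * (cols + 1)
    best = 0
    for i in range(1, len(a) + 1):
        h_cur = [0] * (cols + 1)
        f_cur = [neg_inf] * (cols + 1)
        e = neg_inf  # gap running along the current row
        for j in range(1, cols + 1):
            e = max(h_cur[j - 1] - gap_open, e - gap_extend)
            f_cur[j] = max(h_prev[j] - gap_open, f_prev[j] - gap_extend)
            pair = match if a[i - 1] == b[j - 1] else mismatch
            h_cur[j] = max(0, h_prev[j - 1] + pair, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


# Hypothetical sequences: the second carries a one-base insertion.
left, right = "ACGTACGTAC", "GCATGCATGC"
print(smith_waterman_score(left + right, left + "T" + right))
```

With these assumed scores, twenty matched bases contribute 100 and the single-base gap costs 12, so the example prints 88. A higher open penalty relative to the extension penalty favors one long gap over several short ones, which is the usual motivation for an affine scheme.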

The subject nucleic acids can be cDNAs or genomic DNAs, as well as
fragments thereof, particularly fragments that encode a biologically active gene product
and/or are useful in the methods disclosed herein (e.g., in diagnosis, as a unique
identifier of a differentially expressed gene of interest, etc.). The term "cDNA" as used
herein is intended to include all nucleic acids that share the arrangement of sequence
elements found in native mature mRNA species, where sequence elements are exons
and 3' and 5' non-coding regions. Normally mRNA species have contiguous exons,
with the intervening introns, when present, being removed by nuclear RNA splicing, to
create a continuous open reading frame encoding a polypeptide of the invention.

A genomic sequence of interest comprises the nucleic acid present
between the initiation codon and the stop codon, as defined in the listed sequences,
including all of the introns that are normally present in a native chromosome. It can
further include the 3' and 5' untranslated regions found in the mature mRNA. It can
further include specific transcriptional and translational regulatory sequences, such as
promoters, enhancers, etc., including about 1 kb, but possibly more, of flanking
genomic DNA at either the 5' or 3' end of the transcribed region. The genomic DNA
can be isolated as a fragment of 100 kbp or smaller, and substantially free of flanking
chromosomal sequence. The genomic DNA flanking the coding region, either 3' or
5', or internal regulatory sequences as sometimes found in introns, contains sequences
required for proper tissue, stage-specific, or disease-state specific expression.
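
The relationship between a genomic sequence (exons plus introns) and the corresponding cDNA (contiguous exons forming a continuous open reading frame) can be sketched as follows. The toy gene, its exon coordinates, and the function names are hypothetical; coordinates are 0-based and half-open, and only the three standard stop codons are considered.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def splice_exons(genomic: str, exons: List[Tuple[int, int]]) -> str:
    """Join exon intervals (0-based, half-open, in 5' to 3' order)."""
    previous_end = 0
    parts = []
    for start, end in exons:
        if not (previous_end <= start < end <= len(genomic)):
            raise ValueError("exons must be ordered, non-overlapping, and in range")
        parts.append(genomic[start:end])
        previous_end = end
    return "".join(parts)


def is_open_reading_frame(seq: str) -> bool:
    """True if seq starts with ATG, ends with a stop codon, has a length that is
    a multiple of three, and contains no in-frame stop codon before the end."""
    seq = seq.upper()
    if len(seq) < 6 or len(seq) % 3 != 0 or not seq.startswith("ATG"):
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return codons[-1] in STOP_CODONS and not any(c in STOP_CODONS for c in codons[1:-1])


# Toy gene: exon 1, a 12-nt intron carrying an in-frame stop, exon 2.
genomic = "ATGGCC" + "GTAAGTTGACAG" + "AAATAA"
cdna = splice_exons(genomic, [(0, 6), (18, 24)])
print(cdna, is_open_reading_frame(cdna), is_open_reading_frame(genomic))
```

In this toy case the unspliced genomic sequence is interrupted by a stop codon inside the intron, while the spliced product reads through to the terminal stop.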

The nucleic acid compositions of the subject invention can encode all or
a part of the subject polypeptides. Double or single stranded fragments can be obtained
from the DNA sequence by chemically synthesizing oligonucleotides in accordance
with conventional methods, by restriction enzyme digestion, by PCR amplification, etc.
Isolated polynucleotides and polynucleotide fragments of the invention comprise at
least about 10, about 15, about 20, about 35, about 50, about 100, about 150 to about
200, about 250 to about 300, or about 350 contiguous nt selected from the
polynucleotide sequences as shown in SEQ ID NOs:1-3351. The fragments also
include those of lengths intermediate to the specifically mentioned lengths, such as 35,
36, 37, 38, 39, etc.; 150, 151, 152, 153, 154, etc. For the most part, fragments will be of
at least 15 nt, usually at least 18 nt or 25 nt, and up to at least about 50 contiguous nt in
length or more. In a preferred embodiment, the polynucleotide molecules comprise a
contiguous sequence of at least 12 nt selected from the group consisting of the
polynucleotides shown in SEQ ID NOs:1-3351.

Probes specific to the polynucleotides of the invention can be generated
using the polynucleotide sequences disclosed in SEQ ID NOs:1-3351. The probes are
preferably at least about a 12, 15, 16, 18, 20, 22, 24, or 25 nt fragment of a
corresponding contiguous sequence of SEQ ID NOs:1-3351, and can be less than 2, 1,
0.5, 0.1, or 0.05 kb in length. The probes can be synthesized chemically or can be
generated from longer polynucleotides using restriction enzymes. The probes can be
labeled, for example, with a radioactive, biotinylated, or fluorescent tag. Preferably,
probes are designed based upon an identifying sequence of a polynucleotide of one of
SEQ ID NOs:1-3351. More preferably, probes are designed based on a contiguous
sequence of one of the subject polynucleotides that remains unmasked following
application of a masking program for masking low complexity (e.g., XBLAST) to the
sequence, i.e., one would select an unmasked region, as indicated by the
polynucleotides outside the poly-n stretches of the masked sequence produced by the
masking program.
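
Selection of unmasked regions for probe design can be sketched as below. The sketch assumes that the masking program replaces low-complexity residues with the letter N (hard masking), consistent with the "poly-n stretches" mentioned above; the function name, the example sequence, and the 12-nt default minimum length are illustrative.

```python
import re
from typing import List, Tuple


def unmasked_regions(masked_seq: str, min_length: int = 12) -> List[Tuple[int, int, str]]:
    """Return (start, end, subsequence) for each run of A/C/G/T of at least
    `min_length` nt; coordinates are 0-based and half-open. Any other
    character (such as N) is treated as masked."""
    return [
        (m.start(), m.end(), m.group())
        for m in re.finditer(r"[ACGTacgt]+", masked_seq)
        if len(m.group()) >= min_length
    ]


# Hypothetical masked sequence: a 14-nt clean run, a masked stretch, a 7-nt run.
masked = "ACGTACGTACGTAC" + "NNNNNNNN" + "GATTACA"
print(unmasked_regions(masked))
print(unmasked_regions(masked, min_length=5))
```

Only the first run is long enough to host a 12-nt probe; lowering the minimum length admits the second run as well.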

The polynucleotides of the subject invention are isolated and obtained in
substantial purity, generally as other than an intact chromosome. Usually, the
polynucleotides, either as DNA or RNA, will be obtained substantially free of other
naturally-occurring nucleic acid sequences, generally being at least about 50%, usually
at least about 90% pure and are typically "recombinant", e.g., flanked by one or more
nucleotides with which it is not normally associated on a naturally occurring
chromosome.

The polynucleotides of the invention can be provided as a linear
molecule or within a circular molecule, and can be provided within autonomously
replicating molecules (vectors) or within molecules without replication sequences.
Expression of the polynucleotides can be regulated by their own or by other regulatory
sequences known in the art. The polynucleotides of the invention can be introduced
into suitable host cells using a variety of techniques available in the art, such as
transferrin polycation-mediated DNA transfer, transfection with naked or encapsulated
nucleic acids, liposome-mediated DNA transfer, intracellular transportation of DNA-coated
latex beads, protoplast fusion, viral infection, electroporation, gene gun, calcium
phosphate-mediated transfection, and the like.

The subject nucleic acid compositions can be used to, for example,
produce polypeptides, as probes for the detection of mRNA of the invention in
biological samples (e.g., extracts of human cells), to generate additional copies of the
polynucleotides, to generate ribozymes or antisense oligonucleotides, and as single
stranded DNA probes or as triple-strand forming oligonucleotides. The probes
described herein can be used to, for example, determine the presence or absence of the
polynucleotide sequences as shown in SEQ ID NOs:1-3351 or variants thereof in a
sample. These and other uses are described in more detail below.

Use of Polynucleotides to Obtain Full-Length cDNA, Gene, and Promoter Region

Full-length cDNA molecules comprising the disclosed polynucleotides
are obtained as follows. A polynucleotide having a sequence of one of SEQ ID NOs:1-3351,
or a portion thereof comprising at least 12, 15, 18, or 20 nt, is used as a
hybridization probe to detect hybridizing members of a cDNA library using probe
design methods, cloning methods, and clone selection techniques such as those
described in U.S. Patent No. 5,654,173. Libraries of cDNA are made from selected
tissues, such as normal or tumor tissue, or from tissues of a mammal treated with, for
example, a pharmaceutical agent. Preferably, the tissue is the same as the tissue from
which the polynucleotides of the invention were isolated, as both the polynucleotides
described herein and the cDNA represent expressed genes. Most preferably, the cDNA
library is made from the biological material described herein in the Examples. The
choice of cell type for library construction can be made after the identity of the protein
encoded by the gene corresponding to the polynucleotide of the invention is known.
This will indicate which tissue and cell types are likely to express the related gene, and
thus represent a suitable source for the mRNA for generating the cDNA. As described
in the Examples, cDNA of the invention was isolated from specific cell or tissue types,
and such cells and tissues are preferable for obtaining related nucleic acids.

Techniques for producing and probing nucleic acid sequence libraries are
described, for example, in Sambrook et al., Molecular Cloning: A Laboratory Manual,
2nd Ed. (1989) Cold Spring Harbor Press, Cold Spring Harbor, NY. The cDNA can be
prepared by using primers based on sequence from SEQ ID NOs:1-3351. In one
embodiment, the cDNA library can be made from only poly-adenylated mRNA. Thus,
poly-T primers can be used to prepare cDNA from the mRNA.

Members of the library that are larger than the provided polynucleotides,
and preferably that encompass the complete coding sequence of the native message, are
obtained. In order to confirm that the entire cDNA has been obtained, RNA protection
experiments are performed as follows. Hybridization of a full-length cDNA to an
mRNA will protect the RNA from RNase degradation. If the cDNA is not full length,
then the portions of the mRNA that are not hybridized will be subject to RNase
degradation. This is assayed, as is known in the art, by changes in electrophoretic
mobility on polyacrylamide gels, or by detection of released monoribonucleotides.
Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. (1989) Cold
Spring Harbor Press, Cold Spring Harbor, NY. In order to obtain additional sequences
5' to the end of a partial cDNA, 5' RACE (PCR Protocols: A Guide to Methods and
Applications, (1990) Academic Press, Inc.) can be performed.

Genomic DNA is isolated using the provided polynucleotides in a
manner similar to the isolation of full-length cDNAs. Briefly, the provided
polynucleotides, or portions thereof, are used as probes to libraries of genomic DNA.
Preferably, the library is obtained from the cell type that was used to generate the
polynucleotides of the invention, but this is not essential. Most preferably, the genomic
DNA is obtained from the biological material described herein in the Examples. Such
libraries can be in vectors suitable for carrying large segments of a genome, such as P1
or YAC, as described in detail in Sambrook et al., 9.4-9.30. In addition, genomic
sequences can be isolated from human BAC libraries, which are commercially available
from Research Genetics, Inc., Huntsville, Alabama, USA, for example. In order to
obtain additional 5' or 3' sequences, chromosome walking is performed, as described in
Sambrook et al., such that adjacent and overlapping fragments of genomic DNA are
isolated. These are mapped and pieced together, as is known in the art, using restriction
digestion enzymes and DNA ligase.
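
The piecing together of adjacent, overlapping fragments obtained by chromosome walking has a simple computational analogue. The sketch below joins fragments supplied in walking order by their longest exact suffix-prefix overlap. It is an in silico illustration only, not the restriction-and-ligation procedure referred to above; the fragment sequences, function names, and minimum overlap are hypothetical.

```python
from typing import List


def overlap_length(left: str, right: str, min_overlap: int) -> int:
    """Length of the longest suffix of `left` equal to a prefix of `right`,
    or 0 if no overlap of at least `min_overlap` nt exists."""
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


def join_walk(fragments: List[str], min_overlap: int = 4) -> str:
    """Merge fragments, given in walking order, into one contiguous sequence."""
    contig = fragments[0]
    for fragment in fragments[1:]:
        k = overlap_length(contig, fragment, min_overlap)
        if k == 0:
            raise ValueError("adjacent fragments do not overlap")
        contig += fragment[k:]
    return contig


# Three hypothetical 12-nt fragments, each sharing 6 nt with its neighbor.
steps = ["ATGGCGTTAACC", "TTAACCGGATAG", "GGATAGCTCTGA"]
print(join_walk(steps))
```

Exact matching is a simplification; repeated sequence in a real genome can produce spurious overlaps, which is one reason the mapping step in the text is needed.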

Using the polynucleotide sequences of the invention, corresponding full-length
genes can be isolated using both classical and PCR methods to construct and
probe cDNA libraries. Using either method, Northern blots, preferably, are performed
on a number of cell types to determine which cell lines express the gene of interest at
the highest level. Classical methods of constructing cDNA libraries are taught in
Sambrook et al., supra. With these methods, cDNA can be produced from mRNA and
inserted into viral or expression vectors. Typically, libraries of mRNA comprising
poly(A) tails can be produced with poly(T) primers. Similarly, cDNA libraries can be
produced using the instant sequences as primers.

PCR methods are used to amplify the members of a cDNA library that
comprise the desired insert. In this case, the desired insert will contain sequence from
the full length cDNA that corresponds to the instant polynucleotides. Such PCR
methods include gene trapping and RACE methods as described in Gruber et al., WO
95/04745 and Gruber et al., U.S. Patent No. 5,500,356. Kits are commercially available
to perform gene trapping experiments from, for example, Life Technologies,
Gaithersburg, Maryland, USA. In preferred embodiments of RACE, a common primer
is designed to anneal to an arbitrary adaptor sequence ligated to cDNA ends (Apte and
Siebert, Biotechniques (1993) 15:890-893; Edwards et al., Nuc. Acids Res. (1991)
19:5227-5232). When a single gene-specific RACE primer is paired with the common
primer, preferential amplification of sequences between the single gene specific primer
and the common primer occurs. Commercial cDNA pools modified for use in RACE
are available.

The promoter region of a gene generally is located 5' to the initiation site
for RNA polymerase II. Hundreds of promoter regions contain the "TATA" box, a
sequence such as TATTA or TATAA, which is sensitive to mutations. The promoter
region can be obtained by performing 5' RACE using a primer from the coding region
of the gene. Alternatively, the cDNA can be used as a probe for the genomic sequence,
and the region 5' to the coding region is identified by "walking up." If the gene is
highly expressed or differentially expressed, the promoter from the gene can be of use
in a regulatory construct for a heterologous gene.
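
Once a 5' flanking sequence has been obtained, candidate TATA boxes can be located by a simple motif scan. The following sketch searches only for the two forms named above (TATAA and TATTA) on the given strand; the function name and the example sequences are hypothetical, and a match is a candidate only, not evidence of promoter activity.

```python
import re
from typing import List


def find_tata_boxes(upstream: str) -> List[int]:
    """Return 0-based start positions of TATAA or TATTA in `upstream`.
    A lookahead is used so that overlapping occurrences are all reported."""
    return [m.start() for m in re.finditer(r"(?=TAT[AT]A)", upstream.upper())]


# Hypothetical 5' flanking sequences.
print(find_tata_boxes("GGGCTATAAAAGGC"))
print(find_tata_boxes("CCTATTAGG"))
```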

Once the full-length cDNA or gene is obtained, DNA encoding variants
can be prepared by site-directed mutagenesis, described in detail in Sambrook et al.,
15.3-15.63. The choice of codon or nucleotide to be replaced can be based on disclosure
herein on optional changes in amino acids to achieve altered protein structure and/or
function.

As an alternative method to obtaining DNA or RNA from a biological
material, nucleic acid comprising nucleotides having the sequence of one or more
polynucleotides of the invention can be synthesized. Thus, the invention encompasses
nucleic acid molecules ranging in length from 15 nt (corresponding to at least 15
contiguous nt of one of SEQ ID NOs:1-3351) up to a maximum length suitable for one
or more biological manipulations, including replication and expression, of the nucleic
acid molecule. The invention includes but is not limited to (a) nucleic acid having the
size of a full gene, and comprising at least one of SEQ ID NOs:1-3351; (b) the nucleic
acid of (a) also comprising at least one additional polynucleotide or gene, operably
linked to permit expression of a fusion protein; (c) an expression vector comprising (a)
or (b); (d) a plasmid comprising (a) or (b); and (e) a recombinant viral particle
comprising (a) or (b). Once provided with the polynucleotides disclosed herein,
construction or preparation of (a)-(e) is well within the skill in the art.

The sequence of a nucleic acid comprising at least 15 contiguous nt of at
least any one of SEQ ID NOs:1-3351, preferably the entire sequence of at least any one
of SEQ ID NOs:1-3351, is not limited and can be any sequence of A, T, G, and/or C
(for DNA) and A, U, G, and/or C (for RNA) or modified bases thereof, including
inosine and pseudouridine. The choice of sequence will depend on the desired function
and can be dictated by coding regions desired, the intron-like regions desired, and the
regulatory regions desired. Where the entire sequence of any one of SEQ ID NOs:1-3351
is within the nucleic acid, the nucleic acid obtained is referred to herein as a
polynucleotide comprising the sequence of any one of SEQ ID NOs:1-3351.

Expression of Polypeptide Encoded by Full-Length cDNA or Full-Length Gene

The provided polynucleotides (e.g., a polynucleotide having a sequence
of one of SEQ ID NOs:1-3351), the corresponding cDNA, or the full-length gene is
used to express a partial or complete gene product. Constructs of polynucleotides
having sequences of SEQ ID NOs:1-3351 can be generated synthetically. Alternatively,
single-step assembly of a gene and entire plasmid from large numbers of
oligodeoxyribonucleotides is described by, e.g., Stemmer et al., Gene (Amsterdam)
(1995) 164(1):49-53. In this method, assembly PCR (the synthesis of long DNA
sequences from large numbers of oligodeoxyribonucleotides (oligos)) is described. The
method is derived from DNA shuffling (Stemmer, Nature (1994) 370:389-391), and
does not rely on DNA ligase, but instead relies on DNA polymerase to build
increasingly longer DNA fragments during the assembly process.
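
The oligo layout used in assembly PCR can be illustrated with a small design-and-check routine. This is a sketch under stated assumptions, not the protocol of Stemmer et al.: oligos tile the target with a fixed overlap and alternate between the sense and antisense strands, and the "assembly" is an idealized in silico check that adjacent oligos share the expected overlap. The oligo length (40 nt), overlap (20 nt), function names, and target sequence are all hypothetical.

```python
from typing import List

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_oligos(target: str, oligo_len: int = 40, overlap: int = 20) -> List[str]:
    """Tile `target` with overlapping oligos that alternate strands:
    even-numbered oligos are sense, odd-numbered oligos are antisense."""
    if not 0 < overlap < oligo_len or len(target) <= overlap:
        raise ValueError("need 0 < overlap < oligo_len and a target longer than overlap")
    step = oligo_len - overlap
    oligos = []
    for index, start in enumerate(range(0, len(target) - overlap, step)):
        piece = target[start:start + oligo_len]
        oligos.append(piece if index % 2 == 0 else reverse_complement(piece))
    return oligos


def assemble(oligos: List[str], overlap: int = 20) -> str:
    """Rebuild the sense strand, checking each junction for the shared overlap."""
    sense = [o if i % 2 == 0 else reverse_complement(o) for i, o in enumerate(oligos)]
    product = sense[0]
    for piece in sense[1:]:
        if product[-overlap:] != piece[:overlap]:
            raise ValueError("adjacent oligos do not share the expected overlap")
        product += piece[overlap:]  # polymerase-style extension past the overlap
    return product


# Hypothetical 100-nt target sequence.
target = ("ACGTTGCAGGATCCTA" * 7)[:100]
oligos = design_oligos(target)
print(len(oligos), assemble(oligos) == target)
```

Alternating strands means each oligo can anneal to its neighbors through the overlaps and be extended by polymerase, which is why no ligase is required in the method described above.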

Appropriate polynucleotide constructs are purified using standard
recombinant DNA techniques as described in, for example, Sambrook et al., Molecular
Cloning: A Laboratory Manual, 2nd Ed. (1989) Cold Spring Harbor Press, Cold Spring
Harbor, NY, and under current regulations described in United States Dept. of HHS,
National Institutes of Health (NIH) Guidelines for Recombinant DNA Research. The
gene product encoded by a polynucleotide of the invention is expressed in any
expression system, including, for example, bacterial, yeast, insect, amphibian and
mammalian systems. Vectors, host cells and methods for obtaining expression in same
are well known in the art. Suitable vectors and host cells are described in U.S. Patent
No. 5,654,173.

Polynucleotide molecules comprising a polynucleotide sequence
provided herein are generally propagated by placing the molecule in a vector. Viral and
non-viral vectors are used, including plasmids. The choice of plasmid will depend on
the type of cell in which propagation is desired and the purpose of propagation. Certain
vectors are useful for amplifying and making large amounts of the desired DNA
sequence. Other vectors are suitable for expression in cells in culture. Still other
vectors are suitable for transfer and expression in cells in a whole animal or person. The
choice of appropriate vector is well within the skill of the art. Many such vectors are
available commercially. Methods for preparation of vectors comprising a desired
sequence are well known in the art.

The polynucleotides set forth in SEQ ID NOs: 1-3351 or their
corresponding full-length polynucleotides are linked to regulatory sequences as
appropriate to obtain the desired expression properties. These can include promoters
(attached either at the 5' end of the sense strand or at the 3' end of the antisense strand),
enhancers, terminators, operators, repressors, and inducers. The promoters can be
regulated or constitutive. In some situations it may be desirable to use conditionally
active promoters, such as tissue-specific or developmental stage-specific promoters.
These are linked to the desired nucleotide sequence using the techniques described
above for linkage to vectors. Any techniques known in the art can be used.

When any appropriate host cells or organisms are used to replicate
and/or express the polynucleotides or nucleic acids of the invention, the resulting
replicated nucleic acid, RNA, expressed protein or polypeptide is within the scope of
the invention as a product of the host cell or organism. The product is recovered by any
appropriate means known in the art.

Once the gene corresponding to a selected polynucleotide is identified,
its expression can be regulated in the cell to which the gene is native. For example, an
endogenous gene of a cell can be regulated by an exogenous regulatory sequence as
disclosed in U.S. Patent No. 5,641,670.

Identification of Functional and Structural Motifs of Novel Genes 

Translations of the nucleotide sequence of the provided polynucleotides,
cDNAs or full genes can be aligned with individual known sequences. Similarity with
individual sequences can be used to determine the activity of the polypeptides encoded
by the polynucleotides of the invention. Also, sequences exhibiting similarity with
more than one individual sequence can exhibit activities that are characteristic of either
or both individual sequences.

The full length sequences and fragments of the polynucleotide sequences
of the nearest neighbors can be used as probes and primers to identify and isolate the
full length sequence corresponding to provided polynucleotides. The nearest neighbors
can indicate a tissue or cell type to be used to construct a library for the full-length
sequences corresponding to the provided polynucleotides.

Typically, a selected polynucleotide is translated in all six frames to
determine the best alignment with the individual sequences. The sequences disclosed
herein in the Sequence Listing are in a 5' to 3' orientation, so translation in the three
forward frames can be sufficient. These amino acid sequences are referred to, generally, as
query sequences, which will be aligned with the individual sequences. Databases with
individual sequences are described in "Computer Methods for Macromolecular
Sequence Analysis," Methods in Enzymology (1996) 266, Doolittle, Academic Press,
Inc., a division of Harcourt Brace & Co., San Diego, California, USA. Databases
include GenBank, EMBL, and DNA Database of Japan (DDBJ).

Query and individual sequences can be aligned using the methods and
computer programs described above. Suitable programs include BLAST, available over the
world wide web at http://www.ncbi.nlm.nhi.gov/BLAST. Another alignment algorithm is
Fasta, available in the Genetics Computer Group (GCG) package, Madison,
Wisconsin, USA, a wholly owned subsidiary of Oxford Molecular Group, Inc. Other
techniques for alignment are described in Doolittle, supra. Preferably, an alignment
program that permits gaps in the sequence is utilized to align the sequences. The
Smith-Waterman algorithm is one type of algorithm that permits gaps in sequence alignments.
See Meth. Mol. Biol. (1997) 70:173-187. Also, the GAP program using the Needleman
and Wunsch alignment method can be utilized to align sequences. An alternative search
strategy uses MPSRCH software, which runs on a MASPAR computer. MPSRCH uses
a Smith-Waterman algorithm to score sequences on a massively parallel computer.
This approach improves the ability to identify distantly related sequences,
and is especially tolerant of small gaps and nucleotide sequence errors. Amino acid
sequences encoded by the provided polynucleotides can be used to search both protein
and DNA databases.
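
To make the notion of a gapped local alignment concrete, the following Python sketch computes a Smith-Waterman score with a simple linear gap penalty. It is an illustration only and is not the implementation used by MPSRCH or any other named program; the scoring values (match +2, mismatch -1, gap -2) are assumptions, and a practical protein search would use a substitution matrix and affine gap penalties instead.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Scoring parameters are illustrative assumptions. Only two rows of the
    dynamic programming matrix are kept, so memory use is O(len(b)).
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best
```

For example, aligning `ACGTT` against `ACTT` scores higher with one gap (AC-TT) than any ungapped alignment of the two strings, which is why gap-permitting programs are preferred above.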

High Similarity. In general, in alignment results considered to be of high
similarity, the length of the alignment region is typically at least about 55% of the
total length of the query sequence; more typically, at least about 58%; even more typically,
at least about 60% of the total residue length of the query sequence. Usually, percent
length of the alignment region can be as much as about 62%; more usually, as much as
about 64%; even more usually, as much as about 66%. Further, for high similarity, the
region of alignment typically exhibits at least about 75% sequence identity; more
typically, at least about 78%; even more typically, at least about 80% sequence identity.
Usually, percent sequence identity can be as much as about 82%; more usually, as much
as about 84%; even more usually, as much as about 86%.

The p value is used in conjunction with these methods. If high similarity
is found, the query sequence is considered to have high similarity with a profile
sequence when the p value is less than or equal to about 10⁻²; more usually, less than or
equal to about 10⁻³; even more usually, less than or equal to about 10⁻⁴. More typically,
the p value is no more than about 10⁻⁵; more typically, no more than about
10⁻¹⁰; even more typically, no more than about 10⁻¹⁵ for the query sequence
to be considered high similarity.
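
The three criteria above (alignment length relative to the query, percent identity within the aligned region, and p value) can be combined in a simple filter. The sketch below is illustrative: the function name and argument names are hypothetical, and the defaults are the least stringent thresholds recited in the text (55%, 75%, and 10⁻²).

```python
def is_high_similarity(alignment_len, query_len, identity, p_value,
                       min_coverage=0.55, min_identity=0.75, max_p=1e-2):
    """Apply the least stringent 'high similarity' thresholds from the text.

    alignment_len and query_len are residue counts; identity is the fraction
    of identical residues within the aligned region (0.0 to 1.0).
    """
    if query_len <= 0:
        raise ValueError("query_len must be positive")
    coverage = alignment_len / query_len
    return coverage >= min_coverage and identity >= min_identity and p_value <= max_p
```

Tightening any default (for example `max_p=1e-5`) corresponds to the "more typically" tiers described above.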

Similarity Determined by Sequence Identity Alone. Sequence identity
alone can be used to determine similarity of a query sequence to an individual sequence
and can indicate the activity of the sequence. Such an alignment, preferably, permits
gaps to align sequences. Typically, the query sequence is related to the profile sequence
if the sequence identity over the entire query sequence is at least about 15%; more
typically, at least about 20%; even more typically, at least about 25%; even more
typically, at least about 50%. Sequence identity alone as a measure of similarity is most
useful when the query sequence is usually at least 80 residues in length; more usually,
90 residues; even more usually, at least 95 amino acid residues in length. More
typically, similarity can be concluded based on sequence identity alone when the query
sequence is preferably 100 residues in length; more preferably, 120 residues in length;
even more preferably, 150 amino acid residues in length.
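
A minimal sketch of the identity-alone test follows. It assumes the two sequences have already been aligned with gaps (represented by `-`) and measures identity over the entire query, as the paragraph above describes. The function names are hypothetical, and the defaults (15% identity, 80 residues) are the least stringent values recited in the text.

```python
def identity_over_query(aligned_query, aligned_subject, gap="-"):
    """Fraction of query residues that are identical to the aligned subject residue."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    query_len = sum(1 for q in aligned_query if q != gap)
    if query_len == 0:
        return 0.0
    matches = sum(1 for q, s in zip(aligned_query, aligned_subject)
                  if q != gap and q == s)
    return matches / query_len


def related_by_identity_alone(aligned_query, aligned_subject,
                              min_identity=0.15, min_query_len=80):
    """True when the query is long enough and identical enough to the subject."""
    query_len = sum(1 for q in aligned_query if q != "-")
    if query_len < min_query_len:
        return False
    return identity_over_query(aligned_query, aligned_subject) >= min_identity
```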

Alignments with Profile and Multiple Aligned Sequences. Translations
of the provided polynucleotides can be aligned with amino acid profiles that define
either protein families or common motifs. Also, translations of the provided
polynucleotides can be aligned to multiple sequence alignments (MSA) comprising the
polypeptide sequences of members of protein families or motifs. Similarity or identity
with profile sequences or MSAs can be used to determine the activity of the gene
products (e.g., polypeptides) encoded by the provided polynucleotides or corresponding
cDNA or genes. For example, sequences that show an identity or similarity with a
chemokine profile or MSA can exhibit chemokine activities.

Profiles can be designed manually by (1) creating an MSA, which is an
alignment of the amino acid sequences of members that belong to the family, and (2)
constructing a statistical representation of the alignment. Such methods are described,
for example, in Birney et al., Nucl. Acid Res. (1996) 24(14):2730-2739. MSAs of some
protein families and motifs are publicly available. MSAs are described also in
Sonnhammer et al., Proteins (1997) 28:405-420. A brief description of MSAs is
reported in Pascarella et al., Prot. Eng. (1996) 9(3):249-251. Techniques for building
profiles from MSAs are described in Sonnhammer et al., supra; Birney et al., supra;
and "Computer Methods for Macromolecular Sequence Analysis," Methods in
Enzymology (1996) 266, Doolittle, Academic Press, Inc., San Diego, California, USA.
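
The two steps for designing a profile can be illustrated with a deliberately simple statistical representation: per-column residue frequencies. This Python sketch is not the method of Birney et al. or Sonnhammer et al., which use more elaborate models; the function names and the mean-frequency score are assumptions made for the example.

```python
from collections import Counter


def build_profile(msa):
    """Return one {residue: frequency} dict per alignment column.

    msa is a list of equal-length aligned sequences; '-' marks a gap and is
    excluded from the frequencies.
    """
    if len({len(row) for row in msa}) != 1:
        raise ValueError("msa must contain aligned sequences of equal length")
    profile = []
    for column in zip(*msa):
        counts = Counter(res for res in column if res != "-")
        total = sum(counts.values())
        profile.append({res: n / total for res, n in counts.items()} if total else {})
    return profile


def profile_score(aligned_query, profile):
    """Mean frequency with which the profile observes each query residue."""
    if len(aligned_query) != len(profile):
        raise ValueError("query must be aligned to the profile columns")
    return sum(col.get(res, 0.0) for res, col in zip(aligned_query, profile)) / len(profile)
```

A query that matches the most frequent residue in every column scores close to 1.0, while an unrelated sequence scores close to 0.0.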

Similarity between a query sequence and a protein family or motif can be
determined by (a) comparing the query sequence against the profile and/or (b) aligning
the query sequence with the members of the family or motif. Typically, a program such
as Searchwise is used to compare the query sequence to the statistical representation of
the multiple alignment, also known as a profile (see Birney et al., supra). Other
techniques to compare the sequence and profile are described in Sonnhammer et al.,
supra and Doolittle, supra.

Next, methods described by Feng et al., J. Mol. Evol. (1987) 25:351 and
Higgins et al., CABIOS (1989) 5:151 can be used to align the query sequence with the
members of a family or motif, also known as an MSA. Sequence alignments can be
generated using any of a variety of software tools. Examples include PileUp, which
creates a multiple sequence alignment, and is described in Feng et al., J. Mol. Evol.
(1987) 25:351. Another method, GAP, uses the alignment method of Needleman et al.,
J. Mol. Biol. (1970) 48:443. GAP is best suited for global alignment of sequences. A
third method, BestFit, functions by inserting gaps to maximize the number of matches
using the local homology algorithm of Smith et al., Adv. Appl. Math. (1981) 2:482. In
general, the following factors are used to determine if a similarity between a query
sequence and a profile or MSA exists: (1) number of conserved residues found in the
query sequence, (2) percentage of conserved residues found in the query sequence, (3)
number of frameshifts, and (4) spacing between conserved residues.

Some alignment programs that both translate and align sequences can
make any number of frameshifts when translating the nucleotide sequence to produce
the best alignment. The fewer frameshifts needed to produce an alignment, the stronger
the similarity or identity between the query and profile or MSAs. For example, a weak
similarity resulting from no frameshifts can be a better indication of activity or structure
of a query sequence than a strong similarity resulting from two frameshifts. Preferably,
three or fewer frameshifts are found in an alignment; more preferably, two or fewer
frameshifts; even more preferably, one or fewer frameshifts; even more preferably, no
frameshifts are found in an alignment of query and profile or MSAs.

Conserved residues are those amino acids found at a particular position
in all or some of the family or motif members. Alternatively, a position is considered
conserved if only a certain class of amino acids is found in a particular position in all or
some of the family members. For example, the N-terminal position can contain a
positively charged amino acid, such as lysine, arginine, or histidine.

Typically, a residue of a polypeptide is conserved when a class of amino
acids or a single amino acid is found at a particular position in at least about 40% of all
class members; more typically, at least about 50%; even more typically, at least about
60% of the members. Usually, a residue is conserved when a class or single amino acid
is found in at least about 70% of the members of a family or motif; more usually, at
least about 80%; even more usually, at least about 90%; even more usually, at least
about 95%.

A residue is also considered conserved when three unrelated amino acids are
found at a particular position in some or all of the members; more usually, two
unrelated amino acids. These residues are conserved when the unrelated amino acids
are found at particular positions in at least about 40% of all class members; more
typically, at least about 50%; even more typically, at least about 60% of the members.
Usually, a residue is conserved when a class or single amino acid is found in at least
about 70% of the members of a family or motif; more usually, at least about 80%; even
more usually, at least about 90%; even more usually, at least about 95%.

A query sequence has similarity to a profile or MSA when the query
sequence comprises at least about 25% of the conserved residues of the profile or MSA;
more usually, at least about 30%; even more usually, at least about 40%. Typically, the
query sequence has a stronger similarity to a profile sequence or MSA when the query
sequence comprises at least about 45% of the conserved residues of the profile or MSA;
more typically, at least about 50%; even more typically, at least about 55%.
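
The conserved-residue criteria can be sketched in two steps: find the positions of an MSA at which one amino acid occurs in at least a given fraction of members, then measure what fraction of those conserved residues the query shares. This Python example handles only the single-amino-acid case (not conservation by amino acid class), assumes the query has already been aligned to the MSA columns, and uses hypothetical function names with the 40% threshold from the text as a default.

```python
from collections import Counter


def conserved_positions(msa, min_fraction=0.40):
    """Map column index to the residue found in at least min_fraction of members."""
    conserved = {}
    for idx, column in enumerate(zip(*msa)):
        residue, count = Counter(column).most_common(1)[0]
        if residue != "-" and count / len(column) >= min_fraction:
            conserved[idx] = residue
    return conserved


def fraction_conserved_matched(aligned_query, conserved):
    """Fraction of conserved positions at which the query carries the conserved residue."""
    if not conserved:
        return 0.0
    hits = sum(1 for idx, res in conserved.items()
               if idx < len(aligned_query) and aligned_query[idx] == res)
    return hits / len(conserved)
```

Under the thresholds above, a returned fraction of at least about 0.25 would indicate similarity to the profile or MSA, and about 0.45 or more a stronger similarity.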

Identification of Secreted and Membrane-Bound Polypeptides 

Both secreted and membrane-bound polypeptides of the present
invention are of particular interest. For example, levels of secreted polypeptides can be
assayed in body fluids that are convenient, such as blood, plasma, serum, and other
body fluids such as urine, prostatic fluid and semen. Membrane-bound polypeptides are
useful for constructing vaccine antigens or inducing an immune response. Such
antigens would comprise all or part of the extracellular region of the membrane-bound
polypeptides. Because both secreted and membrane-bound polypeptides comprise a
fragment of contiguous hydrophobic amino acids, hydrophobicity predicting algorithms
can be used to identify such polypeptides.

A signal sequence is usually encoded by both secreted and membrane-bound
polypeptide genes to direct a polypeptide to the surface of the cell. The signal
sequence usually comprises a stretch of hydrophobic residues. Such signal sequences
can fold into helical structures. Membrane-bound polypeptides typically comprise at
least one transmembrane region that possesses a stretch of hydrophobic amino acids that
can traverse the membrane. Some transmembrane regions also exhibit a helical
structure. Hydrophobic fragments within a polypeptide can be identified by using
computer algorithms. Such algorithms include Hopp & Woods, Proc. Natl. Acad. Sci.
USA (1981) 78:3824-3828; Kyte & Doolittle, J. Mol. Biol. (1982) 157:105-132; and
RAOAR algorithm, Degli Esposti et al., Eur. J. Biochem. (1990) 190:207-219.
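
As one example of such an algorithm, a sliding-window hydropathy scan in the style of Kyte & Doolittle can be sketched as below. The scale values are the commonly tabulated Kyte-Doolittle hydropathy indices; the 19-residue window and the 1.6 cutoff are conventional choices for flagging candidate transmembrane segments and are assumptions here, not values stated in this document. Function names are hypothetical.

```python
# Kyte-Doolittle hydropathy indices (one-letter amino acid codes).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of each window; unknown residues contribute 0.0."""
    scores = [KD.get(aa, 0.0) for aa in seq.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def hydrophobic_windows(seq, window=19, threshold=1.6):
    """Start indices of windows whose mean hydropathy meets the threshold."""
    return [i for i, mean in enumerate(hydropathy_profile(seq, window))
            if mean >= threshold]
```

A non-empty result suggests a hydrophobic stretch worth inspecting as a possible signal sequence or transmembrane region; it is a screen, not proof of localization.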

Another method of identifying secreted and membrane-bound
polypeptides is to translate the polynucleotides of the invention in all six frames and
determine if at least 8 contiguous hydrophobic amino acids are present. Those
translated polypeptides with at least 8; more typically, 10; even more typically, 12
contiguous hydrophobic amino acids are considered to be either a putative secreted or
membrane-bound polypeptide. Hydrophobic amino acids include alanine, glycine,
histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, threonine,
tryptophan, tyrosine, and valine.
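
The contiguous-run test can be written directly from the list above. In this Python sketch the hydrophobic set is exactly the thirteen amino acids named in the preceding paragraph (in one-letter code), and the default run length of 8 is the least stringent value given; the function names are hypothetical.

```python
import re

# One-letter codes for the amino acids listed as hydrophobic in the text:
# Ala, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Thr, Trp, Tyr, Val.
HYDROPHOBIC = "AGHILKMFPTWYV"
_RUN = re.compile(f"[{HYDROPHOBIC}]+")


def longest_hydrophobic_run(protein):
    """Length of the longest stretch of contiguous hydrophobic residues."""
    return max((len(run) for run in _RUN.findall(protein.upper())), default=0)


def is_putative_secreted_or_membrane(protein, min_run=8):
    """True when the translation contains at least min_run contiguous hydrophobic residues."""
    return longest_hydrophobic_run(protein) >= min_run
```

In practice the function would be applied to each of the six translated frames of a polynucleotide, and any frame returning True would be flagged for follow-up.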

Identification of the Function of an Expression Product of a Full-Length Gene

Ribozymes, antisense constructs, and dominant negative mutants can be
used to determine function of the expression product of a gene corresponding to a
polynucleotide provided herein. The phosphoramidite method of oligonucleotide
synthesis can be used to construct antisense molecules and ribozymes. See Beaucage et
al., Tet. Lett. (1981) 22:1859 and U.S. Patent No. 4,668,777. Automated devices for
synthesis are available to create oligonucleotides using this chemistry. Examples of
such devices include Biosearch 8600, Models 392 and 394 by Applied Biosystems, a
division of Perkin-Elmer Corp., Foster City, California, USA; and Expedite by
PerSeptive Biosystems, Framingham, Massachusetts, USA. Synthetic RNA, phosphate
analog oligonucleotides, and chemically derivatized oligonucleotides can also be
produced, and can be covalently attached to other molecules. RNA oligonucleotides
can be synthesized, for example, using RNA phosphoramidites. This method can be
performed on an automated synthesizer, such as Applied Biosystems, Models 392 and
394, Foster City, California, USA.

Oligonucleotides of up to 200 nt can be synthesized, more typically, 100
nt, more typically 50 nt; even more typically 30 to 40 nt. These synthetic fragments can
be annealed and ligated together to construct larger fragments. See, for example,
Sambrook et al., supra. Trans-cleaving catalytic RNAs (ribozymes) are RNA
molecules possessing endoribonuclease activity. Ribozymes are specifically designed
for a particular target, and the target message must contain a specific nucleotide
sequence. They are engineered to cleave any RNA species site-specifically in the
background of cellular RNA. The cleavage event renders the mRNA unstable and
prevents protein expression. Importantly, ribozymes can be used to inhibit expression
of a gene of unknown function for the purpose of determining its function in an in vitro
or in vivo context, by detecting the phenotypic effect.

Antisense nucleic acids are designed to specifically bind to RNA,
resulting in the formation of RNA-DNA or RNA-RNA hybrids, with an arrest of DNA
replication, reverse transcription or messenger RNA translation. Antisense
polynucleotides based on a selected polynucleotide sequence can interfere with
expression of the corresponding gene. Antisense polynucleotides are typically
generated within the cell by expression from antisense constructs that contain the
antisense strand as the transcribed strand. Antisense polynucleotides based on the
disclosed polynucleotides will bind and/or interfere with the translation of mRNA
comprising a sequence complementary to the antisense polynucleotide. The expression
products of control cells and cells treated with the antisense construct are compared to
detect the protein product of the gene corresponding to the polynucleotide upon which
the antisense construct is based. The protein is isolated and identified using routine
biochemical methods.
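
Because an antisense polynucleotide is complementary to its target mRNA, its sequence follows from the target by reverse complementation. The short Python sketch below illustrates this for an RNA antisense strand; the function name is hypothetical, and the example covers sequence derivation only, not the choice of target region or any chemical modification.

```python
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(target):
    """Return the antisense RNA (5' to 3') for an mRNA or cDNA sense sequence."""
    mrna = target.upper().replace("T", "U")  # accept a DNA-style sense sequence too
    return mrna.translate(RNA_COMPLEMENT)[::-1]
```

For instance, the sense sequence `AUGGCC` yields the antisense strand `GGCCAU`, which base-pairs with the target when the two strands are antiparallel.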

Given the extensive background literature and clinical experience in
antisense therapy, one skilled in the art can use selected polynucleotides of the
invention as additional potential therapeutics. The choice of polynucleotide can be
narrowed by first testing them for binding to "hot spot" regions of the genome of
cancerous cells. If a polynucleotide is identified as binding to a "hot spot," testing the
polynucleotide as an antisense compound in the corresponding cancer cells is
warranted.

Dominant negative mutations also are readily generated for
corresponding proteins that are active as homomultimers. A mutant polypeptide will
interact with wild-type polypeptides (made from the other allele) and form a
non-functional multimer. Thus, a mutation is in a substrate-binding domain, a catalytic
domain, or a cellular localization domain. Preferably, the mutant polypeptide will be
overproduced. Point mutations are made that have such an effect. In addition, fusion of
different polypeptides of various lengths to the terminus of a protein can yield dominant
negative mutants. General strategies are available for making dominant negative
mutants (see, e.g., Herskowitz, Nature (1987) 329:219). Such techniques can be used to
create loss of function mutations, which are useful for determining protein function.

Polypeptides and Variants Thereof 

The polypeptides of the invention include those encoded by the disclosed
polynucleotides, as well as those encoded by nucleic acids that, by virtue of the
degeneracy of the genetic code, are not identical in sequence to the disclosed
polynucleotides. Thus, the invention
includes within its scope a polypeptide encoded by a polynucleotide having the
sequence of any one of SEQ ID NOs: 1-3351 or a variant thereof.

In general, the term "polypeptide" as used herein refers to the full
length polypeptide encoded by the recited polynucleotide, the polypeptide encoded by
the gene represented by the recited polynucleotide, as well as portions or fragments
thereof. "Polypeptides" also includes variants of the naturally occurring proteins, where
such variants are homologous or substantially similar to the naturally occurring protein,
and can be of an origin of the same or different species as the naturally occurring
protein (e.g., human, murine, or some other species that naturally expresses the recited
polypeptide, usually a mammalian species). In general, variant polypeptides have a
sequence that has at least about 80%, usually at least about 90%, and more usually at
least about 98% sequence identity with a differentially expressed polypeptide of the
invention, as measured by BLAST using the parameters described above. The variant
polypeptides can be naturally or non-naturally glycosylated, i.e., the polypeptide has a
glycosylation pattern that differs from the glycosylation pattern found in the
corresponding naturally occurring protein.

The invention also encompasses homologs of the disclosed polypeptides
(or fragments thereof) where the homologs are isolated from other species, i.e., other
animal or plant species, where such homologs are usually from mammalian species, e.g.,
rodents, such as mice, rats; domestic animals, e.g., horse, cow, dog, cat; and humans.
By "homolog" is meant a polypeptide having at least about 35%, usually at least about
40% and more usually at least about 60% amino acid sequence identity to a particular
differentially expressed protein as identified above, where sequence identity is
determined using the BLAST algorithm, with the parameters described above.

In general, the polypeptides of the subject invention are provided in a
non-naturally occurring environment, e.g., are separated from their naturally occurring
environment. In certain embodiments, the subject protein is present in a composition
that is enriched for the protein as compared to a control. As such, purified polypeptide
is provided, where by purified is meant that the protein is present in a composition that
is substantially free of non-differentially expressed polypeptides, where by substantially
free is meant that less than 90%, usually less than 60% and more usually less than 50%
of the composition is made up of non-differentially expressed polypeptides.

Also within the scope of the invention are variants; variants of
polypeptides include mutants, fragments, and fusions. Mutants can include amino acid
substitutions, additions or deletions. The amino acid substitutions can be conservative
amino acid substitutions or substitutions to eliminate non-essential amino acids, such as
to alter a glycosylation site, a phosphorylation site or an acetylation site, or to minimize
misfolding by substitution or deletion of one or more cysteine residues that are not
necessary for function. Conservative amino acid substitutions are those that preserve
the general charge, hydrophobicity/hydrophilicity, and/or steric bulk of the amino acid
substituted. Variants can be designed so as to retain biological activity of a particular
region of the protein (e.g., a functional domain and/or, where the polypeptide is a
member of a protein family, a region associated with a consensus sequence). Selection
of amino acid alterations for production of variants can be based upon the accessibility
(interior vs. exterior) of the amino acid (see, e.g., Go et al., Int. J. Peptide Protein Res.
(1980) 15:211), the thermostability of the variant polypeptide (see, e.g., Querol et al.,
Prot. Eng. (1996) 9:265), desired glycosylation sites (see, e.g., Olsen and Thomsen, J.
Gen. Microbiol. (1991) 137:579), desired disulfide bridges (see, e.g., Clarke et al.,
Biochemistry (1993) 32:4322; and Wakarchuk et al., Protein Eng. (1994) 7:1379),
desired metal binding sites (see, e.g., Toma et al., Biochemistry (1991) 30:97, and
Haezerbrouck et al., Protein Eng. (1993) 6:643), and desired substitutions within
proline loops (see, e.g., Masui et al., Appl. Env. Microbiol. (1994) 60:3579).
Cysteine-depleted muteins can be produced as disclosed in U.S. Patent No. 4,959,314.

Variants also include fragments of the polypeptides disclosed herein,
particularly biologically active fragments and/or fragments corresponding to functional
domains. Fragments of interest will typically be at least about 10 aa to at least about 15
aa in length, usually at least about 50 aa in length, and can be as long as 300 aa in length
or longer, but will usually not exceed about 1000 aa in length, where the fragment will
have a stretch of amino acids that is identical to a polypeptide encoded by a
polynucleotide having a sequence of any one of SEQ ID NOs: 1-3351, or a homolog thereof.
The protein variants described herein are encoded by polynucleotides that are within the
scope of the invention. The genetic code can be used to select the appropriate codons to
construct the corresponding variants.

Computer-Related Embodiments

In general, a library of polynucleotides is a collection of sequence
information, which information is provided in either biochemical form (e.g., as a
collection of polynucleotide molecules), or in electronic form (e.g., as a collection of
polynucleotide sequences stored in a computer-readable form, as in a computer system
and/or as part of a computer program). The sequence information of the
polynucleotides can be used in a variety of ways, e.g., as a resource for gene discovery,
as a representation of sequences expressed in a selected cell type (e.g., cell type
markers), and/or as markers of a given disease or disease state. In general, a disease
marker is a representation of a gene product that is present in all cells affected by
disease either at an increased or decreased level relative to a normal cell (e.g., a cell of
the same or similar type that is not substantially affected by disease). For example, a
polynucleotide sequence in a library can be a polynucleotide that represents an mRNA,
polypeptide, or other gene product encoded by the polynucleotide, that is either
overexpressed or underexpressed in a breast ductal cell affected by cancer relative to a
normal (i.e., substantially disease-free) breast cell.

The nucleotide sequence information of the library can be embodied in
any suitable form, e.g., electronic or biochemical forms. For example, a library of
sequence information embodied in electronic form comprises an accessible computer
data file (or, in biochemical form, a collection of nucleic acid molecules) that contains
the representative nucleotide sequences of genes that are differentially expressed (e.g.,
overexpressed or underexpressed) as between, for example, i) a cancerous cell and a
normal cell; ii) a cancerous cell and a dysplastic cell; iii) a cancerous cell and a cell
affected by a disease or condition other than cancer; iv) a metastatic cancerous cell and
a normal cell and/or non-metastatic cancerous cell; v) a malignant cancerous cell and a
non-malignant cancerous cell (or a normal cell) and/or vi) a dysplastic cell relative to a
normal cell. Other combinations and comparisons of cells affected by various diseases
or stages of disease will be readily apparent to the ordinarily skilled artisan.
Biochemical embodiments of the library include a collection of nucleic acids that have
the sequences of the genes in the library, where the nucleic acids can correspond to the
entire gene in the library or to a fragment thereof as described in greater detail below.



sequence information of a plurality of polynucleotide sequences, where at least one of 
5 the polynucleotides has a sequence of any of SEQ ID NOs: 1-3351. By plurality is 
meant at least 2, usually at least 3 and can include up to all of SEQ ID NOs: 1-3351. 
The length and number of polynucleotides in the library will vary with the nature of the 
library, e.g., if the library is an oligonucleotide array, a cDNA array, a computer 
database of the sequence information, etc, 

10 Where the library is an electronic library, the nucleic acid sequence 

information can be present in a variety of media. "Media" refers to a manufacture, 
other than an isolated nucleic acid molecule, that contains the sequence information of 
the present invention. Such a manufacture provides the genome sequence or a subset 
thereof in a form that can be examined by means not directly applicable to the sequence 

15 as it exists in a nucleic acid. For example, the nucleotide sequence of the present 
invention, e.g., the nucleic acid sequences of any of the polynucleotides of SEQ ID 
NOs: 1-3351, can be recorded on computer readable media, e.g., any medium that can be 
read and accessed directly by a computer. Such media include, but are not limited to: 
magnetic storage media, such as a floppy disc, a hard disc storage medium, and a 

20 magnetic tape; optical storage media such as CD-ROM; electrical storage media such as 
RAM and ROM; and hybrids of these categories such as magnetic/optical storage 
media. One of skill in the art can readily appreciate how any of the presently known 
computer readable mediums can be used to create a manufacture comprising a recording 
of the present sequence information. "Recorded" refers to a process for storing 

25 information on computer readable medium, using any such methods as known in the art. 
Any convenient data storage structure can be chosen, based on the means used to access 
the stored information. A variety of data processor programs and formats can be used 
for storage, e.g., word processing text file, database format, etc. In addition to the 
sequence information, electronic versions of the libraries of the invention can be 

30 provided in conjunction or connection with other computer-readable information and/or 
other types of computer-readable files (e.g., searchable files, executable files, etc., 
including, but not limited to, for example, search program software, etc.). 



Once recorded in computer readable form, the sequence information can be accessed for a variety of purposes. Computer software to access
sequence information is publicly available. For example, the BLAST (Altschul et al.,
supra.) and BLAZE (Brutlag et al. Comp. Chem. (1993) 17:203) search algorithms on a
Sybase system can be used to identify open reading frames (ORFs) within the genome 
that contain homology to ORFs from other organisms. 
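
As a rough illustration of the ORF-identification step only, the following Python sketch scans both strands of a recorded sequence for ATG-to-stop open reading frames. It is not the BLAST or BLAZE algorithm and it performs no homology comparison; the function names and the `min_codons` cutoff are assumptions introduced for this example.

```python
# Minimal ORF scan over a stored nucleotide sequence (illustrative sketch only).
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """Return (strand, start, end, orf) tuples for ATG...stop ORFs.

    start and end are 0-based offsets on the scanned strand; end is
    exclusive and includes the stop codon. min_codons counts the codons
    from ATG up to, but not including, the stop codon (assumed cutoff).
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens an ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # resume looking for the next ATG
    return orfs
```

ORFs found this way could then be submitted to a homology search program as query sequences.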

As used herein, "a computer-based system" refers to the hardware 
means, software means, and data storage means used to analyze the nucleotide sequence
information of the present invention. The minimum hardware of the computer-based
systems of the present invention comprises a central processing unit (CPU), input
means, output means, and data storage means. A skilled artisan can readily appreciate
that any one of the currently available computer-based systems is suitable for use in the
present invention. The data storage means can comprise any manufacture comprising a
recording of the present sequence information as described above, or a memory access 
means that can access such a manufacture. 

"Search means" refers to one or more programs implemented on the 
computer-based system, to compare a target sequence or target structural motif, or 

expression levels of a polynucleotide in a sample, with the stored sequence information.
Search means can be used to identify fragments or regions of the genome that match a
particular target sequence or target motif. A variety of algorithms are publicly
known and commercially available, e.g., MacPattern (EMBL), BLASTN and BLASTX 
(NCBI). A "target sequence" can be any polynucleotide or amino acid sequence of six 

or more contiguous nucleotides or two or more amino acids, preferably from about 10
to 100 amino acids or from about 30 to 300 nt. A variety of comparing means can be
used to accomplish comparison of sequence information from a sample (e.g., to analyze 
target sequences, target motifs, or relative expression levels) with the data storage 
means. A skilled artisan can readily recognize that any one of the publicly available 

homology search programs can be used as the search means for the computer based
systems of the present invention to accomplish comparison of target sequences and
motifs. Computer programs to analyze expression levels in a sample and in controls are
also known in the art. 
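
To make the notion of a search means concrete, the sketch below compares a target sequence against stored sequence records by exact matching on both strands. It is only a minimal stand-in for a homology search program such as BLASTN, which also reports inexact matches; the record names and function names are hypothetical, and the six-nucleotide minimum is taken from the definition of a target sequence given above.

```python
# Exact-match "search means" over stored sequence records (illustrative sketch).
MIN_TARGET_NT = 6  # a target sequence is six or more contiguous nucleotides

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def search_library(target, library):
    """Find exact occurrences of target in the stored sequences.

    library maps a record name to a nucleotide string. Returns a sorted
    list of (name, strand, offset) tuples, where offset is 0-based on the
    stored sequence and strand "-" means the reverse complement matched.
    """
    target = target.upper()
    if len(target) < MIN_TARGET_NT:
        raise ValueError("target sequence must be at least 6 nt")
    queries = {"+": target, "-": reverse_complement(target)}
    hits = []
    for name, seq in library.items():
        seq = seq.upper()
        for strand, query in queries.items():
            pos = seq.find(query)
            while pos != -1:
                hits.append((name, strand, pos))
                pos = seq.find(query, pos + 1)  # allow overlapping hits
    return sorted(hits)
```

A palindromic target is reported once per strand at the same offset, which a caller may wish to collapse.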

A "target structural motif" or "target motif," refers to any rationally 

selected sequence or combination of sequences in which the sequence(s) are chosen
based on a three-dimensional configuration that is formed upon the folding of the target 
motif, or on consensus sequences of regulatory or active sites. There are a variety of 
target motifs known in the art. Protein target motifs include, but are not limited to,
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are
not limited to, hairpin structures, promoter sequences and other expression elements
such as binding sites for transcription factors. 
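
A consensus-sequence motif of this kind can be searched for by translating the consensus into a regular expression. The sketch below assumes the consensus is written with the standard IUPAC nucleotide ambiguity codes; the function name and the example consensus are illustrative only and are not taken from the present disclosure.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_motif(consensus, seq):
    """Return 0-based start offsets where the consensus matches seq.

    Overlapping matches are reported, because the pattern is wrapped in a
    zero-width lookahead.
    """
    pattern = "".join(IUPAC[base] for base in consensus.upper())
    return [m.start() for m in re.finditer("(?=" + pattern + ")", seq.upper())]
```

For instance, `find_motif("TATAWAW", "GGTATAAATGG")` scans for a TATA-box-like consensus in which W stands for A or T.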

A variety of structural formats for the input and output means can be 
used to input and output the information in the computer-based systems of the present 
invention. One format for an output means ranks the relative expression levels of
different polynucleotides. Such presentation provides a skilled artisan with a ranking of 
relative expression levels to determine a gene expression profile. 
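
One possible form of such a ranked output is sketched below. The identifiers and the tie-breaking rule (alphabetical by identifier) are assumptions made for the example.

```python
def rank_expression(levels):
    """Rank polynucleotides by relative expression level, highest first.

    levels maps a polynucleotide identifier to its relative expression
    level. Returns (rank, identifier, level) tuples with 1-based ranks;
    ties are broken alphabetically by identifier.
    """
    ordered = sorted(levels.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, name, level)
            for rank, (name, level) in enumerate(ordered, start=1)]
```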

As discussed above, the "library" of the invention also encompasses 
biochemical libraries of the polynucleotides of SEQ ID NOs: 1-3351, e.g., collections of 

nucleic acids representing the provided polynucleotides. The biochemical libraries can
take a variety of forms, e.g., a solution of cDNAs, a pattern of probe nucleic acids stably
associated with a surface of a solid support (i.e., an array) and the like. Of particular
interest are nucleic acid arrays in which one or more of SEQ ID NOs: 1-3351 is 
represented on the array. By array is meant an article of manufacture that has at least a 

substrate with at least two distinct nucleic acid targets on one of its surfaces, where the
number of distinct nucleic acids can be considerably higher, typically being at least 10 
nt, usually at least 20 nt and often at least 25 nt. A variety of different array formats 
have been developed and are known to those of skill in the art. The arrays of the subject 
invention find use in a variety of applications, including gene expression analysis, drug 

screening, mutation analysis and the like, as disclosed in the above-listed exemplary
patent documents. 

In addition to the above nucleic acid libraries, analogous libraries of 
polypeptides are also provided, where the polypeptides of the library will
represent at least a portion of the polypeptides encoded by SEQ ID NOs: 1-3351.

Use of Polynucleotide Probes in Mapping, and in Tissue Profiling

Polynucleotide probes, generally comprising at least 12 contiguous nt of 
a polynucleotide as shown in the Sequence Listing, are used for a variety of purposes, 
such as chromosome mapping of the polynucleotide and detection of transcription 
levels. Additional disclosure about preferred regions of the disclosed polynucleotide 

sequences is found in the Examples. A probe that hybridizes specifically to a
polynucleotide disclosed herein should provide a detection signal at least 5-, 10-, or 20- 
fold higher than the background hybridization provided with other unrelated sequences. 
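
The fold-over-background criterion can be expressed as a simple ratio test. In the sketch below the function name and the units of the signal values are assumptions; the default of 5-fold corresponds to the lowest of the thresholds stated above.

```python
def is_specific(signal, background, min_fold=5.0):
    """Return True if signal is at least min_fold times the background.

    signal and background are detection signals in the same arbitrary
    units; background must be positive for the ratio to be meaningful.
    """
    if background <= 0:
        raise ValueError("background signal must be positive")
    return signal / background >= min_fold
```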

Detection of Expression Levels. Nucleotide probes are used to detect
expression of a gene corresponding to the provided polynucleotide. In Northern blots,
mRNA is separated electrophoretically and contacted with a probe. A probe is detected
as hybridizing to an mRNA species of a particular size. The amount of hybridization is 
quantitated to determine relative amounts of expression, for example under a particular 
condition. Probes are used for in situ hybridization to cells to detect expression. Probes 
can also be used in vivo for diagnostic detection of hybridizing sequences. Probes are
typically labeled with a radioactive isotope. Other types of detectable labels can be 
used such as chromophores, fluors, and enzymes. Other examples of nucleotide 
hybridization assays are described in WO92/02526 and U.S. Patent No. 5,124,246. 

Alternatively, the polymerase chain reaction (PCR) can be used
to detect small amounts of target nucleic acids (see, e.g., Mullis et al., Meth.
Enzymol. (1987) 155:335; U.S. Patent No. 4,683,195; and U.S. Patent No. 4,683,202).
Two primer polynucleotides that hybridize with the target nucleic acids are
used to prime the reaction. The primers can be composed of sequence within or 3' and
5' to the polynucleotides of the Sequence Listing. Alternatively, if the primers are 3' and
5' to these polynucleotides, they need not hybridize to them or the complements. After
amplification of the target with a thermostable polymerase, the amplified target nucleic 
acids can be detected by methods known in the art, e.g., Southern blot. mRNA or 
cDNA can also be detected by traditional blotting techniques (e.g., Southern blot, 
Northern blot, etc.) described in Sambrook et al., "Molecular Cloning: A Laboratory
Manual" (New York, Cold Spring Harbor Laboratory, 1989) (e.g., without PCR
amplification). In general, mRNA or cDNA generated from mRNA using a polymerase 
enzyme can be purified and separated using gel electrophoresis, and transferred to a 
solid support, such as nitrocellulose. The solid support is exposed to a labeled probe, 
washed to remove any unhybridized probe, and duplexes containing the labeled probe 

are detected.

Mapping. Polynucleotides of the present invention can be used to
identify a chromosome on which the corresponding gene resides. Such mapping can be 
useful in identifying the function of the polynucleotide-related gene by its proximity to 
other genes with known function. Function can also be assigned to the polynucleotide-related
gene when particular syndromes or diseases map to the same chromosome. For
example, use of polynucleotide probes in identification and quantification of nucleic 
acid sequence aberrations is described in U.S. Patent No. 5,783,387. An exemplary 
mapping method is fluorescence in situ hybridization (FISH), which facilitates 
comparative genomic hybridization to allow total genome assessment of changes in 

relative copy number of DNA sequences (see, e.g., Valdes et al., Methods in Molecular
Biology (1997) 68:1). Polynucleotides can also be mapped to particular chromosomes
using, for example, radiation hybrids or chromosome-specific hybrid panels. See Leach
et al., Advances in Genetics, (1995) 35:63-99; Walter et al., Nature Genetics (1994)
7:22; Walter and Goodfellow, Trends in Genetics (1992) 9:352. Panels for radiation
hybrid mapping are available from Research Genetics, Inc., Huntsville, Alabama, USA.
The statistical program RHMAP can be used to construct a map based on the data from
radiation hybridization with a measure of the relative likelihood of one order versus
another. RHMAP is available via the world wide web at
http://www.sph.umich.edu/group/statgen/software. In addition, commercial programs are available for identifying
regions of chromosomes commonly associated with disease, such as cancer.

Tissue Typing or Profiling. Expression of specific mRNA
corresponding to the provided polynucleotides can vary in different cell types and can
be tissue-specific. This variation of mRNA levels in different cell types can be
exploited with nucleic acid probe assays to determine tissue types. For example, PCR,
branched DNA probe assays, or blotting techniques utilizing nucleic acid probes
substantially identical or complementary to polynucleotides listed in the Sequence 
Listing can determine the presence or absence of the corresponding cDNA or mRNA. 

Tissue typing can be used to identify the developmental organ or tissue 
source of a metastatic lesion by identifying the expression of a particular marker of that 

organ or tissue. If a polynucleotide is expressed only in a specific tissue type, and a
metastatic lesion is found to express that polynucleotide, then the developmental source 
of the lesion has been identified. Expression of a particular polynucleotide can be 
assayed by detection of either the corresponding mRNA or the protein product. 

Use of Polymorphisms. A polynucleotide of the invention can be used in
forensics, genetic analysis, mapping, and diagnostic applications where the
corresponding region of a gene is polymorphic in the human population. Any means for
detecting a polymorphism in a gene can be used, including, but not limited to,
electrophoresis of protein polymorphic variants, differential sensitivity to restriction 
enzyme cleavage, and hybridization to allele-specific probes. 

Antibody Production

Expression products of a polynucleotide of the invention, as well as the 
corresponding mRNA, cDNA, or complete gene, can be prepared and used for raising 
antibodies for experimental, diagnostic, and therapeutic purposes. For polynucleotides 
to which a corresponding gene has not been assigned, this provides an additional
method of identifying the corresponding gene. The polynucleotide or related cDNA is
expressed as described above, and antibodies are prepared. These antibodies are
specific to an epitope on the polypeptide encoded by the polynucleotide, and can
precipitate or bind to the corresponding native protein in a cell or tissue preparation or
in a cell-free extract of an in vitro expression system.

Methods for production of monoclonal and polyclonal antibodies that 
specifically bind a selected antigen are well known in the art. The antibodies 
specifically bind to epitopes present in the polypeptides encoded by polynucleotides 
disclosed in the Sequence Listing. Typically, at least 6, 8, 10, or 12 contiguous amino 

acids are required to form an epitope. Epitopes that involve non-contiguous amino
acids may require a longer polypeptide, e.g., at least 15, 25, or 50 amino acids. 
Antibodies that specifically bind to human polypeptides encoded by the provided 
polynucleotides should provide a detection signal at least 5-, 10-, or 20-fold higher than 
a detection signal provided with other proteins when used in Western blots or other 

immunochemical assays. Preferably, antibodies that specifically bind polypeptides of the
invention do not bind to other proteins in immunochemical assays at detectable levels 
and can immunoprecipitate the specific polypeptide from solution. 

The invention also contemplates naturally occurring antibodies specific 
for a polypeptide of the invention. For example, serum antibodies to a polypeptide of
the invention in a human population can be purified by methods well known in the art,
e.g., by passing antiserum over a column to which the corresponding selected 
polypeptide or fusion protein is bound. The bound antibodies can then be eluted from 
the column, for example using a buffer with a high salt concentration. 

In addition to the antibodies discussed above, the invention also
contemplates genetically engineered antibodies and antibody derivatives (e.g., single chain
antibodies, antibody fragments (e.g., Fab, etc.)), prepared according to methods well known in
the art.

Other embodiments of the present invention include humanized 
monoclonal antibodies capable of binding to the polypeptides of the invention. The 

phrase "humanized antibody" refers to an antibody derived from a non-human antibody,
typically a mouse monoclonal antibody. Alternatively, a humanized antibody may be
derived from a chimeric antibody that retains or substantially retains the antigen- 
binding properties of the parental, non-human, antibody but which exhibits diminished 
immunogenicity as compared to the parental antibody when administered to humans. 

The phrase "chimeric antibody," as used herein, refers to an antibody containing
sequence derived from two different antibodies (see, e.g., U.S. Patent No. 4,816,567)
which typically originate from different species. Most typically, chimeric antibodies 
comprise human and murine antibody fragments, generally human constant and mouse 
variable regions. 

Because humanized antibodies are far less immunogenic in humans than

the parental mouse monoclonal antibodies, they can be used for the treatment of humans 
with far less risk of anaphylaxis. Thus, these antibodies may be preferred in therapeutic 
applications that involve in vivo administration to a human such as, e.g., use as radiation 
sensitizers for the treatment of neoplastic disease or use in methods to reduce the side 
effects of, e.g., cancer therapy.

Humanized antibodies may be achieved by a variety of methods 
including, for example: (1) grafting the non-human complementarity determining 
regions (CDRs) onto a human framework and constant region (a process referred to in 
the art as "humanizing"), or, alternatively, (2) transplanting the entire non-human 
variable domains, but "cloaking" them with a human-like surface by replacement of
surface residues (a process referred to in the art as "veneering"). In the present
invention, humanized antibodies will include both "humanized" and "veneered"
antibodies. These methods are disclosed in, e.g., Jones et al., Nature 321:522-525
(1986); Morrison et al., Proc. Natl. Acad. Sci. U.S.A., 81:6851-6855 (1984); Morrison
and Oi, Adv. Immunol., 44:65-92 (1988); Verhoeyen et al., Science 239:1534-1536
(1988); Padlan, Molec. Immun. 28:489-498 (1991); Padlan, Molec. Immunol.
31(3):169-217 (1994); and Kettleborough, C.A. et al., Protein Eng. 4(7):773-83 (1991), each of
which is incorporated herein by reference.

The phrase "complementarity determining region" refers to amino acid 
sequences which together define the binding affinity and specificity of the natural Fv
region of a native immunoglobulin binding site. See, e.g., Chothia et al., J. Mol. Biol.
196:901-917 (1987); Kabat et al., U.S. Dept. of Health and Human Services NIH
Publication No. 91-3242 (1991). The phrase "constant region" refers to the portion of
the antibody molecule that confers effector functions. In the present invention, mouse
constant regions are substituted by human constant regions. The constant regions of the
subject humanized antibodies are derived from human immunoglobulins. The heavy 
chain constant region can be selected from any of the five isotypes: alpha, delta, 
epsilon, gamma or mu. 

One method of humanizing antibodies comprises aligning the non-human
heavy and light chain sequences to human heavy and light chain sequences,
selecting and replacing the non-human framework with a human framework based on
such alignment, molecular modeling to predict the conformation of the humanized
sequence and comparing to the conformation of the parent antibody. This process is
followed by repeated back mutation of residues in the CDR region which disturb the
structure of the CDRs until the predicted conformation of the humanized sequence
model closely approximates the conformation of the non-human CDRs of the parent
non-human antibody. Such humanized antibodies may be further derivatized to
facilitate uptake and clearance, e.g., via Ashwell receptors. See, e.g., U.S. Patent Nos.
5,530,101 and 5,585,089, which patents are incorporated herein by reference.

Humanized antibodies can also be produced using transgenic animals

that are engineered to contain human immunoglobulin loci. For example, WO 
98/24893 discloses transgenic animals having a human Ig locus wherein the animals do 
not produce functional endogenous immunoglobulins due to the inactivation of 
endogenous heavy and light chain loci. WO 91/10741 also discloses transgenic non-primate
mammalian hosts capable of mounting an immune response to an immunogen,
wherein the antibodies have primate constant and/or variable regions, and wherein the
endogenous immunoglobulin-encoding loci are substituted or inactivated. WO
96/30498 discloses the use of the Cre/Lox system to modify the immunoglobulin locus
in a mammal, such as to replace all or a portion of the constant or variable region to
form a modified antibody molecule. WO 94/02602 discloses non-human mammalian
hosts having inactivated endogenous Ig loci and functional human Ig loci. U.S. Patent
No. 5,939,598 discloses methods of making transgenic mice in which the mice lack
endogenous heavy chains, and express an exogenous immunoglobulin locus comprising
one or more xenogeneic constant regions.

Using a transgenic animal described above, an immune response can be

produced to a selected antigenic molecule, and antibody-producing cells can be 
removed from the animal and used to produce hybridomas that secrete human 
monoclonal antibodies. Immunization protocols, adjuvants, and the like are known in 
the art, and are used in immunization of, for example, a transgenic mouse as described 
in WO 96/33735. This publication discloses monoclonal antibodies against a variety of
antigenic molecules including IL-6, IL-8, TNF, human CD4, L-selectin, gp39, and
tetanus toxin. The monoclonal antibodies can be tested for the ability to inhibit or
neutralize the biological activity or physiological effect of the corresponding protein.
WO 96/33735 discloses that monoclonal antibodies against IL-8, derived from immune
cells of transgenic mice immunized with IL-8, blocked IL-8-induced functions of
neutrophils. Human monoclonal antibodies with specificity for the antigen used to
immunize transgenic animals are also disclosed in WO 96/34096. 

Polynucleotides or Arrays for Diagnostics 

Polynucleotide arrays are created by spotting polynucleotide probes onto
a substrate (e.g., glass, nitrocellulose, etc.) in a two-dimensional matrix or array having
bound probes. The probes can be bound to the substrate by either covalent bonds or by
non-specific interactions, such as hydrophobic interactions. Samples of polynucleotides
can be detectably labeled (e.g., using radioactive or fluorescent labels) and then
hybridized to the probes. Double stranded polynucleotides, comprising the labeled
sample polynucleotides bound to probe polynucleotides, can be detected once the
unbound portion of the sample is washed away. Techniques for constructing arrays and
methods of using these arrays are described in EP 799 897; WO 97/29212; WO
97/27317; EP 785 280; WO 97/02357; U.S. Patent No. 5,593,839; U.S. Patent No.
5,578,832; EP 728 520; U.S. Patent No. 5,599,695; EP 721 016; U.S. Patent No.
5,556,752; WO 95/22058; and U.S. Patent No. 5,631,734. Arrays can be used to, for 
example, examine differential expression of genes and can be used to determine gene 
function. For example, arrays can be used to detect differential expression of a 
polynucleotide between a test cell and control cell (e.g., cancer cells and normal cells). 

For example, high expression of a particular message in a cancer cell, which is not
observed in a corresponding normal cell, can indicate a cancer specific gene product.
Exemplary uses of arrays are further described in, for example, Pappalardo et al., Sem.
Radiation Oncol. (1998) 8:217; and Ramsay, Nature Biotechnol. (1998) 16:40.

Differential Expression in Diagnosis 

The polynucleotides of the invention can also be used to detect

differences in expression levels between two cells, e.g., as a method to identify 
abnormal or diseased tissue in a human. For polynucleotides corresponding to profiles 
of protein families, the choice of tissue can be selected according to the putative 
biological function. In general, the expression of a gene corresponding to a specific 

polynucleotide is compared between a first tissue that is suspected of being diseased
and a second, normal tissue of the human. The tissue suspected of being abnormal or
diseased can be derived from a different tissue type of the human, but preferably it is
derived from the same tissue type; for example an intestinal polyp or other abnormal
growth should be compared with normal intestinal tissue. The normal tissue can be the
same tissue as that of the test sample, or any normal tissue of the patient, especially
those that express the polynucleotide-related gene of interest (e.g., brain, thymus, testis,
heart, prostate, placenta, spleen, small intestine, skeletal muscle, pancreas, and the
mucosal lining of the colon). A difference between the polynucleotide-related gene,
mRNA, or protein in the two tissues which are compared, for example in molecular
weight, amino acid or nucleotide sequence, or relative abundance, indicates a change in
the gene, or a gene which regulates it, in the tissue of the human that was suspected of
being diseased. Examples of detection of differential expression and its use in diagnosis
of cancer are described in U.S. Patent Nos. 5,688,641 and 5,677,125.

A genetic predisposition to disease in a human can also be detected by
comparing expression levels of an mRNA or protein corresponding to a polynucleotide
of the invention in a fetal tissue with levels found in normal fetal tissue. Fetal
tissues that are used for this purpose include, but are not limited to, amniotic fluid,
chorionic villi, blood, and the blastomere of an in vitro-fertilized embryo. The
comparable normal polynucleotide-related gene is obtained from any tissue. The mRNA
or protein is obtained from a normal tissue of a human in which the polynucleotide-related
gene is expressed. Differences such as alterations in the nucleotide sequence or
size of the same product of the fetal polynucleotide-related gene or mRNA, or
alterations in the molecular weight, amino acid sequence, or relative abundance of fetal
protein, can indicate a germline mutation in the polynucleotide-related gene of the fetus,
which indicates a genetic predisposition to disease. In general, diagnostic, prognostic, 
and other methods of the invention based on differential expression involve detection of 
a level or amount of a gene product, particularly a differentially expressed gene product, 
in a test sample obtained from a patient suspected of having or being susceptible to a 

disease (e.g., breast cancer, lung cancer, colon cancer and/or metastatic forms thereof),
and comparing the detected levels to those levels found in normal cells (e.g., cells 
substantially unaffected by cancer) and/or other control cells (e.g., to differentiate a 
cancerous cell from a cell affected by dysplasia). Furthermore, the severity of the 
disease can be assessed by comparing the detected levels of a differentially expressed 

gene product with those levels detected in samples representing the levels of
differentially expressed gene product associated with varying degrees of severity of disease. It
should be noted that use of the term "diagnostic" herein is not necessarily meant to 
exclude "prognostic" or "prognosis," but rather is used as a matter of convenience. 

The term "differentially expressed gene" is generally intended to 

encompass a polynucleotide that can, for example, include an open reading frame
encoding a gene product (e.g., a polypeptide), and/or introns of such genes and adjacent
5' and 3' non-coding nucleotide sequences involved in the regulation of expression, up
to about 20 kb beyond the coding region, but possibly further in either direction. The
gene can be introduced into an appropriate vector for extrachromosomal maintenance or
for integration into a host genome. In general, a difference in expression level
associated with a decrease in expression level of at least about 25%, usually at least
about 50% to 75%, more usually at least about 90% or more is indicative of a
differentially expressed gene of interest, i.e., a gene that is underexpressed or down-regulated
in the test sample relative to a control sample. Furthermore, a difference in
expression level associated with an increase in expression of at least about 25%, usually
at least about 50% to 75%, more usually at least about 90% and can be at least about
1½-fold, usually at least about 2-fold to about 10-fold, and can be about 100-fold to
about 1,000-fold increase relative to a control sample is indicative of a differentially
expressed gene of interest, i.e., an overexpressed or up-regulated gene.
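
The percentage thresholds above can be applied as in the following sketch, which classifies a gene from a test level and a control level. The function name, the return labels, and the choice of the 25% figure as the default cutoff are assumptions made for illustration; a stricter cutoff (e.g., 0.5 or 0.9) can be passed instead.

```python
def classify_expression(test_level, control_level, threshold=0.25):
    """Classify a gene as up-regulated, down-regulated, or unchanged.

    The fractional change is (test - control) / control. A change of at
    least +threshold is called up-regulated and a change of at most
    -threshold is called down-regulated. Returns (label, change).
    """
    if control_level <= 0:
        raise ValueError("control level must be positive")
    change = (test_level - control_level) / control_level
    if change >= threshold:
        return "up-regulated", change
    if change <= -threshold:
        return "down-regulated", change
    return "unchanged", change
```

Under this convention a 2-fold increase corresponds to a change of 1.0, and a 90% decrease to a change of -0.9.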

"Differentially expressed polynucleotide" as used herein means a nucleic
acid molecule (RNA or DNA) comprising a sequence that represents a differentially
expressed gene, e.g., the differentially expressed polynucleotide comprises a sequence
(e.g., an open reading frame encoding a gene product) that uniquely identifies a
differentially expressed gene so that detection of the differentially expressed
polynucleotide in a sample is correlated with the presence of a differentially expressed
gene in a sample. "Differentially expressed polynucleotides" is also meant to
encompass fragments of the disclosed polynucleotides, e.g., fragments retaining
biological activity, as well as nucleic acids homologous, substantially similar, or
substantially identical (e.g., having about 90% sequence identity) to the disclosed
polynucleotides.

"Diagnosis" as used herein generally includes determination of a 
subject's susceptibility to a disease or disorder, determination as to whether a subject is 
presently affected by a disease or disorder, as well as to the prognosis of a subject 
affected by a disease or disorder (e.g., identification of pre-metastatic or metastatic
cancerous states, stages of cancer, or responsiveness of cancer to therapy). The present
invention particularly encompasses diagnosis of subjects in the context of breast cancer
(e.g., carcinoma in situ (e.g., ductal carcinoma in situ), estrogen receptor (ER)-positive
breast cancer, ER-negative breast cancer, or other forms and/or stages of breast cancer),
lung cancer (e.g., small cell carcinoma, non-small cell carcinoma, mesothelioma, and
other forms and/or stages of lung cancer), and colon cancer (e.g., adenomatous polyp,
colorectal carcinoma, and other forms and/or stages of colon cancer). 

"Sample" or "biological sample" as used throughout here are generally 
meant to refer to samples of biological fluids or tissues, particularly samples obtained 
from tissues, especially from cells of the type associated with the disease for which the
diagnostic application is designed (e.g., ductal adenocarcinoma), and the like. 
"Samples" is also meant to encompass derivatives and fractions of such samples (e.g., 
cell lysates). Where the sample is solid tissue, the cells of the tissue can be dissociated 
or tissue sections can be analyzed. 

Methods of the subject invention useful in diagnosis or prognosis
typically involve comparison of the abundance of a selected differentially expressed
gene product in a sample of interest with that of a control to determine any relative
differences in the expression of the gene product, where the difference can be measured
qualitatively and/or quantitatively. Quantitation can be accomplished, for example, by
comparing the level of expression product detected in the sample with the amounts of
product present in a standard curve. A comparison can be made visually; by using a
technique such as densitometry, with or without computerized assistance; by preparing
a representative library of cDNA clones of mRNA isolated from a test sample,
sequencing the clones in the library to determine the number of cDNA clones
corresponding to the same gene product, and analyzing the number of clones
corresponding to that same gene product relative to the number of clones of the same
gene product in a control sample; or by using an array to detect relative levels of
hybridization to a selected sequence or set of sequences, and comparing the
hybridization pattern to that of a control. The differences in expression are then
correlated with the presence or absence of an abnormal expression pattern. A variety of
different methods for determining the nucleic acid abundance in a sample are known to
those of skill in the art (see, e.g., WO 97/27317). In general, diagnostic assays of the
invention involve detection of a gene product of a polynucleotide sequence (e.g.,
mRNA or polypeptide) that corresponds to a sequence of SEQ ID NOs: 1-3351. The
patient from whom the sample is obtained can be apparently healthy, susceptible to
disease (e.g., as determined by family history or exposure to certain environmental
factors), or can already be identified as having a condition in which altered expression 
of a gene product of the invention is implicated. 
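
The clone-counting comparison described above can be sketched as follows. The sketch assumes each sequenced cDNA clone has already been assigned to a gene product, and it normalizes counts by library size before comparing test and control; that normalization, the function names, and the handling of a gene absent from the control library are assumptions made for illustration.

```python
from collections import Counter


def clone_frequencies(clone_genes):
    """Return the fraction of clones assigned to each gene product.

    clone_genes is a list with one gene identifier per sequenced clone.
    """
    counts = Counter(clone_genes)
    total = sum(counts.values())
    return {gene: n / total for gene, n in counts.items()}


def relative_abundance(test_clones, control_clones, gene):
    """Ratio of a gene's clone frequency in the test vs. control library.

    Returns None when the gene has no clones in the control library, since
    the ratio is then undefined.
    """
    test = clone_frequencies(test_clones).get(gene, 0.0)
    control = clone_frequencies(control_clones).get(gene, 0.0)
    if control == 0.0:
        return None
    return test / control
```

A ratio well above or below 1 would then be weighed against whatever threshold the assay adopts for differential expression.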

Diagnosis can be determined based on detected gene product expression 

levels of a gene product encoded by at least one, preferably at least two or more, at least
3 or more, or at least 4 or more of the polynucleotides having a sequence set forth in
SEQ ID NOs: 1-3351, and can involve detection of expression of genes corresponding to 
all of SEQ ID NOs: 1-3351 and/or additional sequences that can serve as additional 
diagnostic markers and/or reference sequences. Where the diagnostic method is 
5 designed to detect the presence or susceptibility of a patient to cancer, the assay 
preferably involves detection of a gene product encoded by a gene corresponding to a 
polynucleotide that is differentially expressed in cancer. Examples of such differentially 
expressed polynucleotides are described in the Examples below. Given the provided 
polynucleotides and information regarding their relative expression levels provided 

10 herein, assays using such polynucleotides and detection of their expression levels in 
diagnosis and prognosis will be readily apparent to the ordinarily skilled artisan. 

Any of a variety of detectable labels can be used in connection with the
various embodiments of the diagnostic methods of the invention. Suitable detectable
labels include fluorochromes (e.g., fluorescein isothiocyanate (FITC), rhodamine,
Texas Red, phycoerythrin, allophycocyanin, 6-carboxyfluorescein (6-FAM),
2',7'-dimethoxy-4',5'-dichloro-6-carboxyfluorescein, 6-carboxy-X-rhodamine (ROX),
6-carboxy-2',4',7',4,7-hexachlorofluorescein (HEX), 5-carboxyfluorescein (5-FAM) or
N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA)), radioactive labels (e.g., 32P,
35S, 3H, etc.), and the like. The detectable label can involve a two-stage system (e.g.,
biotin-avidin, hapten-anti-hapten antibody, etc.).

Reagents specific for the polynucleotides and polypeptides of the
invention, such as antibodies and nucleotide probes, can be supplied in a kit for
detecting the presence of an expression product in a biological sample. The kit can also
contain buffers or labeling components, as well as instructions for using the reagents to
detect and quantify expression products in the biological sample. Exemplary
embodiments of the diagnostic methods of the invention are described below in more
detail.

Polypeptide detection in diagnosis. In one embodiment, the test sample
is assayed for the level of a differentially expressed polypeptide. Diagnosis can be
accomplished using any of a number of methods to determine the absence or presence
or altered amounts of the differentially expressed polypeptide in the test sample. For
example, detection can utilize staining of cells or histological sections with labeled
antibodies, performed in accordance with conventional methods. Cells can be
permeabilized to stain cytoplasmic molecules. In general, antibodies that specifically
bind a differentially expressed polypeptide of the invention are added to a sample, and
incubated for a period of time sufficient to allow binding to the epitope, usually at least
about 10 minutes. The antibody can be detectably labeled for direct detection (e.g.,
using radioisotopes, enzymes, fluorescers, chemiluminescers, and the like), or can be
used in conjunction with a second stage antibody or reagent to detect binding (e.g.,
biotin with horseradish peroxidase-conjugated avidin, a secondary antibody conjugated
to a fluorescent compound, e.g., fluorescein, rhodamine, Texas red, etc.). The absence
or presence of antibody binding can be determined by various methods, including flow
cytometry of dissociated cells, microscopy, radiography, scintillation counting, etc.
Any suitable alternative method of qualitative or quantitative detection of levels or
amounts of differentially expressed polypeptide can be used, for example ELISA,
western blot, immunoprecipitation, radioimmunoassay, etc.

mRNA detection. The diagnostic methods of the invention can also or
alternatively involve detection of mRNA encoded by a gene corresponding to a
differentially expressed polynucleotide of the invention. Any suitable qualitative or
quantitative methods known in the art for detecting specific mRNAs can be used.
mRNA can be detected by, for example, in situ hybridization in tissue sections, by
reverse transcriptase-PCR, or in Northern blots containing poly A+ mRNA. One of
skill in the art can readily use these methods to determine differences in the size or
amount of mRNA transcripts between two samples. mRNA expression levels in a
sample can also be determined by generation of a library of expressed sequence tags
(ESTs) from the sample, where the EST library is representative of sequences present in
the sample (Adams et al., (1991) Science 252:1651). Enumeration of the relative
representation of ESTs within the library can be used to approximate the relative
representation of the gene transcript within the starting sample. The results of EST
analysis of a test sample can then be compared to EST analysis of a reference sample to
determine the relative expression levels of a selected polynucleotide, particularly a
polynucleotide corresponding to one or more of the differentially expressed genes
described herein. Alternatively, gene expression in a test sample can be analyzed
using serial analysis of gene expression (SAGE) methodology (e.g., Velculescu et al.,
Science (1995) 270:484) or differential display (DD) methodology (see, e.g., U.S.
Patent Nos. 5,776,683 and 5,807,680).
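
The enumeration of ESTs described above can be illustrated with a short sketch. The
function names and the clone assignments are hypothetical, and the example assumes that
each sequenced clone has already been assigned to a gene; it is not a description of any
particular EST analysis pipeline.

```python
from collections import Counter


def relative_representation(clone_gene_ids):
    """Fraction of sequenced clones assigned to each gene in one library."""
    counts = Counter(clone_gene_ids)
    total = sum(counts.values())
    return {gene: n / total for gene, n in counts.items()}


def fold_change(test_clones, reference_clones, gene):
    """Ratio of a gene's clone frequency in the test vs. the reference library."""
    test = relative_representation(test_clones)
    ref = relative_representation(reference_clones)
    if ref.get(gene, 0.0) == 0.0:
        raise ValueError("gene not observed in the reference library")
    return test.get(gene, 0.0) / ref[gene]


# Illustrative libraries of 20 sequenced clones each.
test_library = ["geneA"] * 6 + ["geneB"] * 14
reference_library = ["geneA"] * 2 + ["geneB"] * 18
ratio = fold_change(test_library, reference_library, "geneA")  # about 3-fold
```

Comparing frequencies rather than raw clone counts keeps the ratio meaningful when the
two libraries were sequenced to different depths.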

Alternatively, gene expression can be analyzed using hybridization
analysis. Oligonucleotides or cDNA can be used to selectively identify or capture DNA
or RNA of specific sequence composition, and the amount of RNA or cDNA hybridized
to a known capture sequence determined qualitatively or quantitatively, to provide
information about the relative representation of a particular message within the pool of
cellular messages in a sample. Hybridization analysis can be designed to allow for
concurrent screening of the relative expression of hundreds to thousands of genes by
using, for example, array-based technologies having high density formats, including
filters, microscope slides, or microchips, or solution-based technologies that use
spectroscopic analysis (e.g., mass spectrometry). One exemplary use of arrays in the
diagnostic methods of the invention is described below in more detail.

Use of a single gene in diagnostic applications. The diagnostic methods
of the invention can focus on the expression of a single differentially expressed gene.
For example, the diagnostic method can involve detecting a differentially expressed
gene, or a polymorphism of such a gene (e.g., a polymorphism in a coding region or
control region), that is associated with disease. Disease-associated polymorphisms can
include deletion or truncation of the gene, mutations that alter expression level and/or
affect activity of the encoded protein, etc.

A number of methods are available for analyzing nucleic acids for the
presence of a specific sequence, e.g., a disease associated polymorphism. Where large
amounts of DNA are available, genomic DNA is used directly. Alternatively, the
region of interest is cloned into a suitable vector and grown in sufficient quantity for
analysis. Cells that express a differentially expressed gene can be used as a source of
mRNA, which can be assayed directly or reverse transcribed into cDNA for analysis.
The nucleic acid can be amplified by conventional techniques, such as the polymerase
chain reaction (PCR), to provide sufficient amounts for analysis, and a detectable label
can be included in the amplification reaction (e.g., using a detectably labeled primer or
detectably labeled oligonucleotides) to facilitate detection. Alternatively, various
methods are known in the art that utilize oligonucleotide ligation as a means of
detecting polymorphisms; see, e.g., Riley et al., Nucl. Acids Res. (1990) 18:2887; and
Delahunty et al., Am. J. Hum. Genet. (1996) 58:1239.

The amplified or cloned sample nucleic acid can be analyzed by one of a
number of methods known in the art. The nucleic acid can be sequenced by dideoxy or
other methods, and the sequence of bases compared to a selected sequence, e.g., to a
wild-type sequence. Hybridization with the polymorphic or variant sequence can also
be used to determine its presence in a sample (e.g., by Southern blot, dot blot, etc.). The
hybridization pattern of a polymorphic or variant sequence and a control sequence to an
array of oligonucleotide probes immobilized on a solid support, as described in U.S.
Patent No. 5,445,934, or in WO 95/35505, can also be used as a means of identifying
polymorphic or variant sequences associated with disease. Single strand
conformational polymorphism (SSCP) analysis, denaturing gradient gel electrophoresis
(DGGE), and heteroduplex analysis in gel matrices are used to detect conformational
changes created by DNA sequence variation as alterations in electrophoretic mobility.
Alternatively, where a polymorphism creates or destroys a recognition site for a
restriction endonuclease, the sample is digested with that endonuclease, and the
products are size fractionated to determine whether the fragment was digested.
Fractionation is performed by gel or capillary electrophoresis, particularly on
acrylamide or agarose gels.
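
The restriction-site approach can be illustrated in silico. The sketch below is a
simplified assumption-laden example: the function name and the two short sequences are
hypothetical, the molecule is treated as linear, and the cut is placed at the start of the
recognition site, whereas a real endonuclease cuts at an enzyme-specific position within
or near its site.

```python
def digest_fragments(sequence, site):
    """Return fragment lengths after cutting a linear sequence at each site.

    The cut position is simplified to the first base of the recognition
    site; this is an illustrative convention, not an enzyme's true offset.
    """
    seq = sequence.upper()
    site = site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    # Drop zero-length fragments that arise when a site starts at position 0.
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


# Hypothetical amplicons: the variant carries a single base change in the site.
wild_type = "AAAAGAATTCGGGGGG"
variant = "AAAAGAGTTCGGGGGG"
wild_type_sizes = digest_fragments(wild_type, "GAATTC")
variant_sizes = digest_fragments(variant, "GAATTC")
```

Two fragments for the wild-type amplicon and a single undigested fragment for the
variant correspond to the band patterns that size fractionation would distinguish.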

Screening for mutations in a gene can be based on the functional or
antigenic characteristics of the protein. Protein truncation assays are useful in detecting
deletions that can affect the biological activity of the protein. Various immunoassays
designed to detect polymorphisms in proteins can be used in screening. Where many
diverse genetic mutations lead to a particular disease phenotype, functional protein
assays have proven to be effective screening tools. The activity of the encoded protein
can be determined by comparison with the wild-type protein.

Pattern matching in diagnosis using arrays. In another embodiment, the
diagnostic and/or prognostic methods of the invention involve detection of expression
of a selected set of genes in a test sample to produce a test expression pattern (TEP).
The TEP is compared to a reference expression pattern (REP), which is generated by
detection of expression of the selected set of genes in a reference sample (e.g., a
positive or negative control sample). The selected set of genes includes at least one of
the genes of the invention, which genes correspond to the polynucleotide sequences of
SEQ ID NOs: 1-3351. Of particular interest is a selected set of genes that includes genes
differentially expressed in the disease for which the test sample is to be screened.

"Reference sequences" or "reference polynucleotides" as used herein in 
the context of differential gene expression analysis and diagnosis/prognosis refers to a 
selected set of polynucleotides, which selected set includes at least one or more of the 
differentially expressed polynucleotides described herein. A plurality of reference
sequences, preferably comprising positive and negative control sequences, can be
included as reference sequences. Additional suitable reference sequences are found in
Genbank, Unigene, and other nucleotide sequence databases (including, e.g., expressed
sequence tag (EST), partial, and full-length sequences).

"Reference array" means an array having reference sequences for use in 

hybridization with a sample, where the reference sequences include all, at least one of,
or any subset of the differentially expressed polynucleotides described herein. Usually
such an array will include at least 3 different reference sequences, and can include any
one or all of the provided differentially expressed sequences. Arrays of interest can
further comprise sequences, including polymorphisms, of other genetic sequences,
particularly other sequences of interest for screening for a disease or disorder (e.g.,
cancer, dysplasia, or other related or unrelated diseases, disorders, or conditions). The
oligonucleotide sequence on the array will usually be at least about 12 nt in length, and
can be of about the length of the provided sequences, or can extend into the flanking
regions to generate fragments of 100 nt to 200 nt in length or more. Reference arrays
can be produced according to any suitable methods known in the art. For example,
methods of producing large arrays of oligonucleotides are described in U.S. Patent Nos.
5,134,854 and 5,445,934 using light-directed synthesis techniques. Using a computer
controlled system, a heterogeneous array of monomers is converted, through
simultaneous coupling at a number of reaction sites, into a heterogeneous array of
polymers. Alternatively, microarrays are generated by deposition of pre-synthesized
oligonucleotides onto a solid substrate, for example as described in PCT published
application no. WO 95/35505.

A "reference expression pattern" or "REP" as used herein refers to the
relative levels of expression of a selected set of genes, particularly of differentially
expressed genes, that is associated with a selected cell type, e.g., a normal cell, a
cancerous cell, a cell exposed to an environmental stimulus, and the like. A "test
expression pattern" or "TEP" refers to relative levels of expression of a selected set of
genes, particularly of differentially expressed genes, in a test sample (e.g., a cell of
unknown or suspected disease state, from which mRNA is isolated).

REPs can be generated in a variety of ways according to methods well
known in the art. For example, REPs can be generated by hybridizing a control sample
to an array having a selected set of polynucleotides (particularly a selected set of
differentially expressed polynucleotides), acquiring the hybridization data from the
array, and storing the data in a format that allows for ready comparison of the REP with
a TEP. Alternatively, all expressed sequences in a control sample can be isolated and
sequenced, e.g., by isolating mRNA from a control sample, converting the mRNA into
cDNA, and sequencing the cDNA. The resulting sequence information roughly or
precisely reflects the identity and relative number of expressed sequences in the sample.
The sequence information can then be stored in a format (e.g., a computer-readable
format) that allows for ready comparison of the REP with a TEP. The REP can be
normalized prior to or after data storage, and/or can be processed to selectively remove
sequences of expressed genes that are of less interest or that might complicate analysis
(e.g., some or all of the sequences associated with housekeeping genes can be
eliminated from REP data).

TEPs can be generated in a manner similar to REPs, e.g., by hybridizing
a test sample to an array having a selected set of polynucleotides, particularly a selected
set of differentially expressed polynucleotides, acquiring the hybridization data from the
array, and storing the data in a format that allows for ready comparison of the TEP with
a REP. The REP and TEP to be used in a comparison can be generated simultaneously,
or the TEP can be compared to previously generated and stored REPs.
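
Normalization of an expression pattern, with optional removal of sequences of less
interest, can be sketched as follows. The function name, the gene labels, the signal
values, and the choice of total-signal scaling are assumptions made for illustration; the
specification does not prescribe a particular normalization method.

```python
def normalize_pattern(raw_signals, exclude=()):
    """Scale signals so the retained genes sum to 1 after dropping excluded genes.

    raw_signals: mapping of gene label to raw hybridization signal.
    exclude: labels to remove before scaling (e.g., housekeeping genes).
    """
    kept = {g: s for g, s in raw_signals.items() if g not in exclude}
    total = sum(kept.values())
    if total <= 0:
        raise ValueError("no signal left to normalize")
    return {g: s / total for g, s in kept.items()}


# Illustrative raw data; "ACTB" stands in for a housekeeping gene to be removed.
raw = {"ACTB": 900.0, "gene1": 60.0, "gene2": 40.0}
normalized = normalize_pattern(raw, exclude={"ACTB"})
```

Applying the same function, with the same exclusions, to both the REP and the TEP puts
the two patterns on a common scale before they are compared.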

In one embodiment of the invention, comparison of a TEP with a REP
involves hybridizing a test sample with a reference array, where the reference array has
one or more reference sequences for use in hybridization with a sample. The reference
sequences include all, at least one of, or any subset of the differentially expressed
polynucleotides described herein. Hybridization data for the test sample is acquired, the
data normalized, and the produced TEP compared with a REP generated using an array
having the same or similar selected set of differentially expressed polynucleotides.
Probes that correspond to sequences differentially expressed between the two samples
will show decreased or increased hybridization efficiency for one of the samples
relative to the other.

Methods for collection of data from hybridization of samples with
reference arrays are well known in the art. For example, the polynucleotides of the
reference and test samples can be generated using a detectable fluorescent label, and
hybridization of the polynucleotides in the samples detected by scanning the
microarrays for the presence of the detectable label using, for example, a microscope
and light source for directing light at a substrate. A photon counter detects fluorescence
from the substrate, while an x-y translation stage varies the location of the substrate. A
confocal detection device that can be used in the subject methods is described in U.S.
Patent No. 5,631,734. A scanning laser microscope is described in Shalon et al.,
Genome Res. (1996) 6:639. A scan, using the appropriate excitation line, is performed
for each fluorophore used. The digital images generated from the scan are then
combined for subsequent analysis. For any particular array element, the fluorescent
signal from one sample (e.g., a test sample) is compared to the fluorescent
signal from another sample (e.g., a reference sample), and the relative signal intensity
determined.
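
The per-element comparison of signals from two samples can be illustrated as a
log-ratio calculation. The function name, the element labels, the intensity values, and
the use of a base-2 logarithm with a signal floor are assumptions chosen for the example.

```python
import math


def log2_ratios(test_signal, reference_signal, floor=1.0):
    """Per-element log2(test / reference) for elements present in both scans.

    Signals below `floor` are raised to `floor` to avoid division by zero.
    """
    ratios = {}
    for element in test_signal.keys() & reference_signal.keys():
        t = max(test_signal[element], floor)
        r = max(reference_signal[element], floor)
        ratios[element] = math.log2(t / r)
    return ratios


# Illustrative intensities from two fluorophore channels of one array.
test_scan = {"spot1": 800.0, "spot2": 100.0, "spot3": 250.0}
reference_scan = {"spot1": 200.0, "spot2": 400.0, "spot3": 250.0}
relative = log2_ratios(test_scan, reference_scan)
```

On a log scale a fourfold increase and a fourfold decrease are symmetric about zero,
which makes increased and decreased hybridization equally easy to read.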

Methods for analyzing the data collected from hybridization to arrays are
well known in the art. For example, where detection of hybridization involves a
fluorescent label, data analysis can include the steps of determining fluorescent intensity
as a function of substrate position from the data collected, removing outliers, i.e., data
deviating from a predetermined statistical distribution, and calculating the relative
binding affinity of the targets from the remaining data. The resulting data can be
displayed as an image with the intensity in each region varying according to the binding
affinity between targets and probes.
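
The outlier-removal step can be sketched as follows. The rule used here, discarding
values more than a fixed multiple of the median absolute deviation from the median, is
an assumption standing in for the "predetermined statistical distribution"; the function
name, the cutoff, and the intensity values are likewise illustrative.

```python
from statistics import mean, median


def trimmed_intensity(values, k=3.0):
    """Mean intensity after discarding values far from the median.

    A value is kept when it lies within k median absolute deviations
    (MAD) of the median of all values for the array element.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    kept = [v for v in values if abs(v - med) <= k * mad]
    return mean(kept)


# Illustrative pixel intensities for one array element; 400 is an artifact.
pixels = [100, 102, 98, 101, 99, 400]
element_intensity = trimmed_intensity(pixels)
```

The intensity retained for each element can then be compared between samples to
estimate relative binding.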

In general, the test sample is classified as having a gene expression
profile corresponding to that associated with a disease or non-disease state by
comparing the TEP generated from the test sample to one or more REPs generated from
reference samples (e.g., from samples associated with cancer or specific stages of
cancer, dysplasia, samples affected by a disease other than cancer, normal samples,
etc.). The criteria for a match or a substantial match between a TEP and a REP include
expression of the same or substantially the same set of reference genes, as well as
expression of these reference genes at substantially the same levels (e.g., no significant
difference between the samples for a signal associated with a selected reference
sequence after normalization of the samples, or at least no greater than about 25% to
about 40% difference in signal strength for a given reference sequence). In general, a
pattern match between a TEP and a REP includes a match in expression, preferably a
match in qualitative or quantitative expression level, of at least one of, all or any subset
of the differentially expressed genes of the invention.
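
The match criteria above can be expressed as a simple comparison. In this sketch the
function name, the gene labels, and the signal values are hypothetical; the difference is
measured relative to the REP value, and the two patterns are required to cover exactly
the same genes, both of which are simplifying assumptions. The default threshold of
0.25 corresponds to the lower end of the "about 25% to about 40%" range.

```python
def patterns_match(tep, rep, max_relative_difference=0.25):
    """True when both patterns cover the same genes at similar levels.

    tep, rep: mappings of gene label to normalized signal.
    """
    if set(tep) != set(rep):
        return False
    for gene, ref_level in rep.items():
        test_level = tep[gene]
        if ref_level == 0:
            if test_level != 0:
                return False
            continue
        if abs(test_level - ref_level) / ref_level > max_relative_difference:
            return False
    return True


# Illustrative normalized patterns.
reference_pattern = {"g1": 1.0, "g2": 0.5, "g3": 2.0}
close_pattern = {"g1": 1.2, "g2": 0.45, "g3": 1.8}
distant_pattern = {"g1": 1.5, "g2": 0.45, "g3": 1.8}
is_match = patterns_match(close_pattern, reference_pattern)
is_mismatch = patterns_match(distant_pattern, reference_pattern, 0.40)
```

A test sample would typically be compared against several stored REPs in turn, and
assigned to the disease or non-disease state whose REP it matches.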

Pattern matching can be performed manually, or can be performed using
a computer program. Methods for preparation of substrate matrices (e.g., arrays),
design of oligonucleotides for use with such matrices, labeling of probes, hybridization
conditions, scanning of hybridized matrices, and analysis of patterns generated,
including comparison analysis, are described in, for example, U.S. Patent No.
5,800,992.

Diagnosis, Prognosis and Management of Cancer 
The polynucleotides of the invention and their gene products are of
particular interest as genetic or biochemical markers (e.g., in blood or tissues) that will
detect the earliest changes along the carcinogenesis pathway and/or monitor the
efficacy of various therapies and preventive interventions. For example, the level of
expression of certain polynucleotides can be indicative of a poorer prognosis, and
therefore warrant more aggressive chemo- or radio-therapy for a patient, or vice versa.
The correlation of novel surrogate tumor specific features with response to treatment
and outcome in patients can define prognostic indicators that allow the design of
tailored therapy based on the molecular profile of the tumor. These therapies include
antibody targeting and gene therapy. Determining expression of certain polynucleotides
and comparison of a patient's profile with known expression in normal tissue and
variants of the disease allows a determination of the best possible treatment for a
patient, both in terms of specificity of treatment and in terms of comfort level of the
patient. Surrogate tumor markers, such as polynucleotide expression, can also be used
to better classify, and thus diagnose and treat, different forms and disease states of
cancer. Two classifications widely used in oncology that can benefit from identification
of the expression levels of the polynucleotides of the invention are staging of the
cancerous disorder, and grading the nature of the cancerous tissue.

The polynucleotides of the invention can be useful to monitor patients
having or susceptible to cancer to detect potentially malignant events at a molecular
level before they are detectable at a gross morphological level. Furthermore, a
polynucleotide of the invention identified as important for one type of cancer can also
have implications for development or risk of development of other types of cancer, e.g.,
where a polynucleotide is differentially expressed across various cancer types. Thus,
for example, expression of a polynucleotide that has clinical implications for metastatic
colon cancer can also have clinical implications for stomach cancer or endometrial
cancer.

Staging. Staging is a process used by physicians to describe how
advanced the cancerous state is in a patient. Generally, if a cancer is only detectable in
the area of the primary lesion without having spread to any lymph nodes it is called
Stage I. If it has spread only to the closest lymph nodes, it is called Stage II. In Stage
III, the cancer has generally spread to the lymph nodes in near proximity to the site of
the primary lesion. Cancers that have spread to a distant part of the body, such as the
liver, bone, brain or other site, are Stage IV, the most advanced stage.

The polynucleotides of the invention can facilitate fine-tuning of the
staging process by identifying markers for the aggressivity of a cancer, e.g., the
metastatic potential, as well as the presence in different areas of the body. Thus,
detection in a Stage II cancer of a polynucleotide signifying high metastatic potential
can be used to change a borderline Stage II tumor to a Stage III tumor, justifying more
aggressive therapy. Conversely, the presence of a polynucleotide signifying a lower
metastatic potential allows more conservative staging of a tumor.

Grading of cancers. Grade is a term used to describe how closely a
tumor resembles normal tissue of its same type. The microscopic appearance of a tumor
is used to identify tumor grade based on parameters such as cell morphology, cellular
organization, and other markers of differentiation. As a general rule, the grade of a
tumor corresponds to its rate of growth or aggressiveness, with undifferentiated or
high-grade tumors being more aggressive than well differentiated or low-grade tumors.
The following guidelines are generally used for grading tumors: 1) GX Grade cannot be
assessed; 2) G1 Well differentiated; 3) G2 Moderately well differentiated; 4) G3 Poorly
differentiated; 5) G4 Undifferentiated. The polynucleotides of the invention can be
especially valuable in determining the grade of the tumor, as they not only can aid in
determining the differentiation status of the cells of a tumor, they can also identify
factors other than differentiation that are valuable in determining the aggressivity of a
tumor, such as metastatic potential.

Detection of lung cancer. The polynucleotides of the invention can be
used to detect lung cancer in a subject. Although there are more than a dozen different
kinds of lung cancer, the two main types of lung cancer are small cell and nonsmall cell,
which encompass about 90% of all lung cancer cases. Small cell carcinoma (also called
oat cell carcinoma) usually starts in one of the larger bronchial tubes, grows fairly
rapidly, and is likely to be large by the time of diagnosis. Nonsmall cell lung cancer
(NSCLC) is made up of three general subtypes of lung cancer. Epidermoid carcinoma
(also called squamous cell carcinoma) usually starts in one of the larger bronchial tubes
and grows relatively slowly. The size of these tumors can range from very small to
quite large. Adenocarcinoma starts growing near the outside surface of the lung and can
vary in both size and growth rate. Some slowly growing adenocarcinomas are described
as alveolar cell cancer. Large cell carcinoma starts near the surface of the lung, grows
rapidly, and the growth is usually fairly large when diagnosed. Other less common
forms of lung cancer are carcinoid, cylindroma, mucoepidermoid, and malignant
mesothelioma.

The polynucleotides of the invention, e.g., polynucleotides differentially
expressed in normal cells versus cancerous lung cells (e.g., tumor cells of high or low
metastatic potential) or between types of cancerous lung cells (e.g., high metastatic
versus low metastatic), can be used to distinguish types of lung cancer as well as
to identify traits specific to a certain patient's cancer and to select an appropriate
therapy. For example, if the patient's biopsy expresses a polynucleotide that is
associated with a low metastatic potential, it may justify leaving a larger portion of the
patient's lung in surgery to remove the lesion. Alternatively, a smaller lesion with
expression of a polynucleotide that is associated with high metastatic potential may
justify a more radical removal of lung tissue and/or the surrounding lymph nodes, even
if no metastasis can be identified through pathological examination.

Detection of breast cancer. The majority of breast cancers are
adenocarcinoma subtypes, which can be summarized as follows: 1) ductal carcinoma
in situ (DCIS), including comedocarcinoma; 2) infiltrating (or invasive) ductal
carcinoma (IDC); 3) lobular carcinoma in situ (LCIS); 4) infiltrating (or invasive)
lobular carcinoma (ILC); 5) inflammatory breast cancer; 6) medullary carcinoma;
7) mucinous carcinoma; 8) Paget's disease of the nipple; 9) Phyllodes tumor; and
10) tubular carcinoma.

The expression of polynucleotides of the invention can be used in the
diagnosis and management of breast cancer, as well as to distinguish between types of
breast cancer. Detection of breast cancer can be determined using expression levels of
any of the appropriate polynucleotides of the invention, either alone or in combination.
The aggressive nature and/or the metastatic potential of a breast cancer
can also be determined by comparing levels of one or more polynucleotides of the
invention with levels of another sequence known to vary in cancerous tissue,
e.g., ER expression. In addition, development of breast cancer can be detected by
examining the ratio of expression of a differentially expressed polynucleotide to the
levels of steroid hormones (e.g., testosterone or estrogen) or to other hormones (e.g.,
growth hormone, insulin). Thus expression of specific marker polynucleotides can be
used to discriminate between normal and cancerous breast tissue, to discriminate
between breast cancers with different cells of origin, to discriminate between breast
cancers with different potential metastatic rates, etc.

Detection of colon cancer. The polynucleotides of the invention
exhibiting the appropriate expression pattern can be used to detect colon cancer in a
subject. Colorectal cancer is one of the most common neoplasms in humans and
perhaps the most frequent form of hereditary neoplasia. Prevention and early detection
are key factors in controlling and curing colorectal cancer. Colorectal cancer begins as
polyps, which are small, benign growths of cells that form on the inner lining of the
colon. Over a period of several years, some of these polyps accumulate additional
mutations and become cancerous. Multiple familial colorectal cancer disorders have
been identified, which are summarized as follows: 1) Familial adenomatous polyposis
(FAP); 2) Gardner's syndrome; 3) Hereditary nonpolyposis colon cancer (HNPCC); and
4) Familial colorectal cancer in Ashkenazi Jews. The expression of appropriate
polynucleotides of the invention can be used in the diagnosis, prognosis and
management of colorectal cancer. Detection of colon cancer can be determined using
expression levels of any of these sequences, alone or in combination. The aggressive
nature and/or the metastatic potential of a colon cancer can be determined by comparing
levels of one or more polynucleotides of the invention with total levels of another
sequence known to vary in cancerous tissue, e.g., expression of p53, DCC, ras, or FAP
(see, e.g., Fearon ER, et al., Cell (1990) 61(5):759; Hamilton SR et al., Cancer (1993)
72:957; Bodmer W, et al., Nat Genet. (1994) 4(3):217; Fearon ER, Ann N Y Acad Sci.
(1995) 768:101). For example, development of colon cancer can be detected by
examining the ratio of any of the polynucleotides of the invention to the levels of
oncogenes (e.g., ras) or tumor suppressor genes (e.g., FAP or p53). Thus expression of
specific marker polynucleotides can be used to discriminate between normal and
cancerous colon tissue, to discriminate between colon cancers with different cells of
origin, to discriminate between colon cancers with different potential metastatic rates,
etc.

Use of Polynucleotides to Screen for Peptide Analogs and Antagonists 
Polypeptides encoded by the instant polynucleotides and corresponding
full length genes can be used to screen peptide libraries to identify binding partners,
such as receptors, from among the encoded polypeptides. Peptide libraries can be
synthesized according to methods known in the art (see, e.g., U.S. Patent No. 5,010,175,
and WO 91/17823). Agonists or antagonists of the polypeptides of the invention can be
screened using any available method known in the art, such as signal transduction,
antibody binding, receptor binding, mitogenic assays, chemotaxis assays, etc. The
assay conditions ideally should resemble the conditions under which the native activity
is exhibited in vivo, that is, under physiologic pH, temperature, and ionic strength.
Suitable agonists or antagonists will exhibit strong inhibition or enhancement of the
native activity at concentrations that do not cause toxic side effects in the subject.
Agonists or antagonists that compete for binding to the native polypeptide can require
concentrations equal to or greater than the native concentration, while inhibitors capable
of binding irreversibly to the polypeptide can be added in concentrations on the order of
the native concentration.

Such screening and experimentation can lead to identification of a novel
polypeptide binding partner, such as a receptor, encoded by a gene or a cDNA
corresponding to a polynucleotide of the invention, and at least one peptide agonist or
antagonist of the novel binding partner. Such agonists and antagonists can be used to
modulate, enhance, or inhibit receptor function in cells to which the receptor is native,
or in cells that possess the receptor as a result of genetic engineering. Further, if the
novel receptor shares biologically important characteristics with a known receptor,
information about agonist/antagonist binding can facilitate development of improved
agonists/antagonists of the known receptor.



Pharmaceutical Compositions and Therapeutic Uses

Pharmaceutical compositions of the invention can comprise
polypeptides, antibodies, or polynucleotides (including antisense nucleotides and
ribozymes) of the claimed invention in a therapeutically effective amount. The term
"therapeutically effective amount" as used herein refers to an amount of a therapeutic
agent to treat, ameliorate, or prevent a desired disease or condition, or to exhibit a
detectable therapeutic or preventative effect. The effect can be detected by, for
example, chemical markers or antigen levels. Therapeutic effects also include reduction
in physical symptoms, such as decreased body temperature. The precise effective
amount for a subject will depend upon the subject's size and health, the nature and
extent of the condition, and the therapeutics or combination of therapeutics selected for
administration. Thus, it is not useful to specify an exact effective amount in advance.
However, the effective amount for a given situation is determined by routine
experimentation and is within the judgment of the clinician. For purposes of the present
invention, an effective dose will generally be from about 0.01 mg/kg to 50 mg/kg, or
0.05 mg/kg to about 10 mg/kg, of the DNA constructs in the individual to which they are
administered.

A pharmaceutical composition can also contain a pharmaceutically
acceptable carrier. The term "pharmaceutically acceptable carrier" refers to a carrier for
administration of a therapeutic agent, such as antibodies or a polypeptide, genes, and
other therapeutic agents. The term refers to any pharmaceutical carrier that does not
itself induce the production of antibodies harmful to the individual receiving the
composition, and which can be administered without undue toxicity. Suitable carriers
can be large, slowly metabolized macromolecules such as proteins, polysaccharides,
polylactic acids, polyglycolic acids, polymeric amino acids, amino acid copolymers,
and inactive virus particles. Such carriers are well known to those of ordinary skill in
the art. Pharmaceutically acceptable carriers in therapeutic compositions can include
liquids such as water, saline, glycerol and ethanol. Auxiliary substances, such as
wetting or emulsifying agents, pH buffering substances, and the like, can also be present
in such vehicles. Typically, the therapeutic compositions are prepared as injectables,
either as liquid solutions or suspensions; solid forms suitable for solution in, or
suspension in, liquid vehicles prior to injection can also be prepared. Liposomes are
included within the definition of a pharmaceutically acceptable carrier.
Pharmaceutically acceptable salts can also be present in the pharmaceutical
composition, e.g., mineral acid salts such as hydrochlorides, hydrobromides,
phosphates, sulfates, and the like; and the salts of organic acids such as acetates,
propionates, malonates, benzoates, and the like. A thorough discussion of
pharmaceutically acceptable excipients is available in Remington's Pharmaceutical
Sciences (Mack Pub. Co., New Jersey, 1991).

Delivery Methods. Once formulated, the compositions of the invention
can be (1) administered directly to the subject (e.g., as polynucleotides or polypeptides);
or (2) delivered ex vivo, to cells derived from the subject (e.g., as in ex vivo gene
therapy). Direct delivery of the compositions will generally be accomplished by
parenteral injection, e.g., subcutaneously, intraperitoneally, intravenously,
intramuscularly, intratumorally, or to the interstitial space of a tissue. Other modes of
administration include oral and pulmonary administration, suppositories, and
transdermal applications, needles, and gene guns or hyposprays. Dosage treatment can
be a single dose schedule or a multiple dose schedule.



Methods for the ex vivo delivery and reimplantation of transformed cells
into a subject are known in the art and described in, e.g., International Publication No.
WO 93/14778. Examples of cells useful in ex vivo applications include, for example,
stem cells, particularly hematopoietic, lymph cells, macrophages, dendritic cells, or
tumor cells. Generally, delivery of nucleic acids for both ex vivo and in vitro
applications can be accomplished by, for example, dextran-mediated transfection,
calcium phosphate precipitation, polybrene mediated transfection, protoplast fusion,
electroporation, encapsulation of the polynucleotide(s) in liposomes, and direct
microinjection of the DNA into nuclei, all well known in the art.

Once a gene corresponding to a polynucleotide of the invention has been
found to correlate with a proliferative disorder, such as neoplasia, dysplasia, and
hyperplasia, the disorder can be amenable to treatment by administration of a
therapeutic agent based on the provided polynucleotide, corresponding polypeptide or
other corresponding molecule (e.g., antisense, ribozyme, etc.).

The dose and the means of administration of the inventive
pharmaceutical compositions are determined based on the specific qualities of the
therapeutic composition, the condition, age, and weight of the patient, the progression
of the disease, and other relevant factors. For example, administration of
polynucleotide therapeutic agents of the invention includes local or
systemic administration, including injection, oral administration, particle gun or
catheterized administration, and topical administration. Preferably, the therapeutic
polynucleotide composition contains an expression construct comprising a promoter
operably linked to a polynucleotide of at least 12, 22, 25, 30, or 35 contiguous nt of the
polynucleotide disclosed herein. Various methods can be used to administer the
therapeutic composition directly to a specific site in the body. For example, a small
metastatic lesion is located and the therapeutic composition injected several times in
several different locations within the body of the tumor. Alternatively, arteries which
serve a tumor are identified, and the therapeutic composition injected into such an
artery, in order to deliver the composition directly into the tumor. A tumor that has a
necrotic center is aspirated and the composition injected directly into the now empty
center of the tumor. The antisense composition is directly administered to the surface of
the tumor, for example, by topical application of the composition. X-ray imaging is used
to assist in certain of the above delivery methods.

Receptor-mediated targeted delivery of therapeutic compositions
containing an antisense polynucleotide, subgenomic polynucleotides, or antibodies to
specific tissues can also be used. Receptor-mediated DNA delivery techniques are
described in, for example, Findeis et al., Trends Biotechnol. (1993) 11:202; Chiou et al.,
Gene Therapeutics: Methods And Applications Of Direct Gene Transfer (J.A. Wolff,
ed.) (1994); Wu et al., J. Biol. Chem. (1988) 263:621; Wu et al., J. Biol. Chem. (1994)
269:542; Zenke et al., Proc. Natl. Acad. Sci. (USA) (1990) 87:3655; Wu et al., J. Biol.
Chem. (1991) 266:338. Therapeutic compositions containing a polynucleotide are
administered in a range of about 100 ng to about 200 mg of DNA for local
administration in a gene therapy protocol. Concentration ranges of about 500 ng to
about 50 mg, about 1 mg to about 2 mg, about 5 mg to about 500 mg, and about 20 mg
to about 100 mg of DNA can also be used during a gene therapy protocol. Factors such
as method of action (e.g., for enhancing or inhibiting levels of the encoded gene
product) and efficacy of transformation and expression are considerations which will
affect the dosage required for ultimate efficacy of the antisense subgenomic
polynucleotides. Where greater expression is desired over a larger area of tissue, larger
amounts of antisense subgenomic polynucleotides or the same amounts readministered
in a successive protocol of administrations, or several administrations to different
adjacent or close tissue portions of, for example, a tumor site, may be required to effect
a positive therapeutic outcome. In all cases, routine experimentation in clinical trials
will determine specific ranges for optimal therapeutic effect. For polynucleotide-related
genes encoding polypeptides or proteins with anti-inflammatory activity, suitable use,
doses, and administration are described in U.S. Patent No. 5,654,173.

The therapeutic polynucleotides and polypeptides of the present
invention can be delivered using gene delivery vehicles. The gene delivery vehicle can
be of viral or non-viral origin (see generally, Jolly, Cancer Gene Therapy (1994) 1:51;
Kimura, Human Gene Therapy (1994) 5:845; Connelly, Human Gene Therapy (1995)
1:185; and Kaplitt, Nature Genetics (1994) 6:148). Expression of such coding
sequences can be induced using endogenous mammalian or heterologous promoters.
Expression of the coding sequence can be either constitutive or regulated.

Viral-based vectors for delivery of a desired polynucleotide and
expression in a desired cell are well known in the art. Exemplary viral-based vehicles
include, but are not limited to, recombinant retroviruses (see, e.g., WO 90/07936; WO
94/03622; WO 93/25698; WO 93/25234; U.S. Patent No. 5,219,740; WO 93/11230;
WO 93/10218; U.S. Patent No. 4,777,127; GB Patent No. 2,200,651; EP 0 345 242; and
WO 91/02805), alphavirus-based vectors (e.g., Sindbis virus vectors, Semliki forest
virus (ATCC VR-67; ATCC VR-1247), Ross River virus (ATCC VR-373; ATCC VR-1246)
and Venezuelan equine encephalitis virus (ATCC VR-923; ATCC VR-1250;
ATCC VR-1249; ATCC VR-532)), and adeno-associated virus (AAV) vectors (see, e.g.,
WO 94/12649, WO 93/03769; WO 93/19191; WO 94/28938; WO 95/11984 and WO
95/00655). Administration of DNA linked to killed adenovirus as described in Curiel,
Hum. Gene Ther. (1992) 3:147 can also be employed.

Non-viral delivery vehicles and methods can also be employed,
including, but not limited to, polycationic condensed DNA linked or unlinked to killed
adenovirus alone (see, e.g., Curiel, Hum. Gene Ther. (1992) 3:147); ligand-linked
DNA (see, e.g., Wu, J. Biol. Chem. 264:16985 (1989)); eukaryotic cell delivery vehicle
cells (see, e.g., U.S. Patent No. 5,814,482; WO 95/07994; WO 96/17072;
WO 95/30763; and WO 97/42338); and nucleic charge neutralization or fusion with cell
membranes. Naked DNA can also be employed. Exemplary naked DNA introduction
methods are described in WO 90/11092 and U.S. Patent No. 5,580,859. Liposomes that
can act as gene delivery vehicles are described in U.S. Patent No. 5,422,120; WO
95/13796; WO 94/23697; WO 91/14445; and EP 0524968. Additional approaches are
described in Philip, Mol. Cell Biol. 14:2411 (1994), and in Woffendin, Proc. Natl.
Acad. Sci. (1994) 91:11581.

Further non-viral delivery suitable for use includes mechanical delivery
systems such as the approach described in Woffendin et al., Proc. Natl. Acad. Sci. USA
91(24):11581 (1994). Moreover, the coding sequence and the product of expression of
such can be delivered through deposition of photopolymerized hydrogel materials or
use of ionizing radiation (see, e.g., U.S. Patent No. 5,206,152 and WO 92/11033).
Other conventional methods for gene delivery that can be used for delivery of the
coding sequence include, for example, use of a hand-held gene transfer particle gun (see,
e.g., U.S. Patent No. 5,149,655); and use of ionizing radiation for activating a transferred
gene (see, e.g., U.S. Patent No. 5,206,152 and WO 92/11033).

The present invention will now be illustrated by reference to the
following examples which set forth particularly advantageous embodiments. However,
it should be noted that these embodiments are illustrative and are not to be construed as
restricting the invention in any way.

EXAMPLES
EXAMPLE 1 

Source of Biological Materials and Overview of Novel Polynucleotides 
Expressed by the Biological Materials 


Cell lines and human normal and tumor tissue were used to construct
cDNA libraries from mRNA isolated from the cells and tissues. Most sequences were
about 275-300 nucleotides in length. The cell lines include the KM12L4-A cell line, a
highly metastatic colon cancer cell line (Morikawa, K. et al., Cancer Research (1988)
48:6863). The KM12L4-A cell line is derived from the KM12C cell line. The KM12C
cell line, which is poorly metastatic (low metastatic), was established in culture from a
Dukes' stage B2 surgical specimen (Morikawa et al., Cancer Res. (1988) 48:6863). The
KM12L4-A is a highly metastatic subline derived from KM12C (Yeatman et al., Nucl.
Acids Res. (1995) 23:4007; Bao-Ling et al., Proc. Annu. Meet. Am. Assoc. Cancer Res.
(1995) 21:3269). The KM12C and KM12C-derived cell lines (e.g., KM12L4,
KM12L4-A, etc.) are well-recognized in the art as model cell lines for the study of
colon cancer (see, e.g., Morikawa et al., supra; Radinsky et al., Clin. Cancer Res.
(1995) 1:19; Yeatman et al., (1995) supra; Yeatman et al., Clin. Exp. Metastasis (1996)
14:246). These and other cell lines and tissue are described in Table 6.

The sequences of the isolated polynucleotides were first masked to
eliminate low complexity sequences using the XBLAST masking program (Claverie
"Effective Large-Scale Sequence Similarity Searches," In: Computer Methods for
Macromolecular Sequence Analysis, Doolittle, ed., Meth. Enzymol. 256:212-227
Academic Press, NY, NY (1996); see particularly Claverie, in "Automated DNA
Sequencing and Analysis Techniques" Adams et al., eds., Chap. 36, p. 267 Academic
Press, San Diego, 1994 and Claverie et al., Comput. Chem. (1993) 17:191). Generally,
masking does not influence the final search results, except to eliminate sequences of
relatively little interest due to their low complexity, and to eliminate multiple "hits" based
on similarity to repetitive regions common to multiple sequences, e.g., Alu repeats. The
sequences remaining after masking were then used in a BLASTN vs. Genbank search;
sequences that exhibited greater than 70% overlap, 99% identity, and a p value of less
than 1 x 10^-40 were discarded. Sequences from this search also were discarded if the
inclusive parameters were met, but the sequence was ribosomal or vector-derived.
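
The discard rule of the BLASTN vs. Genbank search can be sketched as a simple predicate. This is a minimal illustration only, not the program actually used: the function name and argument names are hypothetical, "99% identity" is read here as "at least 99%", and ribosomal or vector-derived sequences are assumed to be discarded regardless of the other thresholds.

```python
def discard_after_genbank_search(overlap_pct, identity_pct, p_value,
                                 ribosomal_or_vector=False):
    """Return True if a masked query sequence should be discarded.

    overlap_pct, identity_pct: percentages (0-100) for the best Genbank hit.
    p_value: BLASTN p value of that hit.
    ribosomal_or_vector: True if the sequence is ribosomal or vector-derived
    (assumed here to force a discard on its own).
    """
    matches_known_sequence = (
        overlap_pct > 70.0          # greater than 70% overlap
        and identity_pct >= 99.0    # 99% identity (assumed inclusive)
        and p_value < 1e-40         # p value of less than 1 x 10^-40
    )
    return matches_known_sequence or ribosomal_or_vector


# A near-identical, well-covered hit is discarded; a weak hit is kept.
print(discard_after_genbank_search(95.0, 99.5, 1e-60))   # True
print(discard_after_genbank_search(50.0, 82.0, 1e-8))    # False
```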


The resulting sequences from the previous search were classified into
three groups (1, 2 and 3 below) and searched in a BLASTX vs. NRP (non-redundant
proteins) database search: (1) unknown (no hits in the Genbank search), (2) weak
similarity (greater than 45% identity and p value of less than 1 x 10^-5), and (3) high
similarity (greater than 60% overlap, greater than 80% identity, and p value less than 1
x 10^-5). Sequences having greater than 70% overlap, greater than 99% identity, and p
value of less than 1 x 10^-40 were discarded.
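
The three-group classification and the discard threshold can be expressed as an ordered series of tests. The sketch below is an assumption-laden illustration: the `Hit` type and `classify` function are hypothetical names, the most stringent test is applied first, and a hit that fails even the weak-similarity thresholds is treated as "unknown", which the text does not state explicitly.

```python
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    """Best database hit for one query (hypothetical record layout)."""
    overlap_pct: float
    identity_pct: float
    p_value: float


def classify(hit: Optional[Hit]) -> str:
    """Assign a query to 'unknown', 'weak similarity', 'high similarity',
    or 'discard' using the thresholds given in the text."""
    if hit is None:
        return "unknown"                      # no hits
    if hit.overlap_pct > 70 and hit.identity_pct > 99 and hit.p_value < 1e-40:
        return "discard"                      # essentially a known sequence
    if hit.overlap_pct > 60 and hit.identity_pct > 80 and hit.p_value < 1e-5:
        return "high similarity"
    if hit.identity_pct > 45 and hit.p_value < 1e-5:
        return "weak similarity"
    return "unknown"                          # assumption: sub-threshold hits


print(classify(Hit(65.0, 85.0, 1e-10)))       # high similarity
```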

The remaining sequences were classified as unknown (no hits), weak
similarity, and high similarity (parameters as above). Two searches were performed on
these sequences. First, a BLAST vs. EST database search was performed and
sequences with greater than 99% overlap, greater than 99% similarity and a p value of
less than 1 x 10^-40 were discarded. Sequences with a p value of less than 1 x 10^-65 when
compared to a database sequence of human origin were also excluded. Second, a
BLASTN vs. Patent GeneSeq database search was performed and sequences having
greater than 99% identity, p value less than 1 x 10^-40, and greater than 99% overlap were
discarded.

The remaining sequences were subjected to screening using other rules
and redundancies in the dataset. Sequences with a p value of less than 1 x 10^-111 in
relation to a database sequence of human origin were specifically excluded. The final
result provided the 3351 sequences listed in the accompanying Sequence Listing. Each
identified polynucleotide represents sequence from at least a partial mRNA transcript.
Polynucleotides that were determined to be novel were assigned a sequence
identification number.

The novel polynucleotides were assigned sequence identification numbers
SEQ ID NOs: 1-3351. The first 1847 DNA sequences corresponding to the novel
polynucleotides are provided in the Sequence Listing in Table 1. DNA sequences
corresponding to the novel polynucleotides of SEQ ID NOs: 1848-3351 are provided in the
Sequence Listing in Table 2. The DNA sequences of Table 2, while numbered SEQ ID
1-1504, correspond to SEQ ID NOs: 1848-3351 in the Sequence Listing, e.g., Table 2 SEQ ID
1 is SEQ ID NO: 1848, Table 2 SEQ ID 2 is SEQ ID NO: 1849, etc. Each DNA sequence in
Table 4 is uniquely identified by a number that is 1847 less than its SEQ ID NO in the
Sequence Listing. Tables 1 and 2 provide: 1) the SEQ ID NO assigned to each sequence
for use in the present specification or a corresponding number; 2) the sequence name used
as an internal identifier of the sequence; 3) the name assigned to the clone from which the
sequence was isolated; and 4) the number of the cluster to which the sequence is assigned
(Cluster ID; where the cluster ID is 0, the sequence was not assigned to any cluster).
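
The numbering relationship between the table-local numbers (1-1504) and the SEQ ID NOs of the Sequence Listing (1848-3351) is a fixed offset of 1847. A minimal sketch of the conversion in both directions follows; the function and constant names are hypothetical and introduced only for illustration.

```python
TABLE_OFFSET = 1847          # sequences listed before SEQ ID NO: 1848
TABLE_SIZE = 1504            # table-local numbers run from 1 to 1504


def table_number_to_seq_id_no(table_number: int) -> int:
    """Convert a table-local number (1-1504) to its SEQ ID NO (1848-3351)."""
    if not 1 <= table_number <= TABLE_SIZE:
        raise ValueError("table number must be between 1 and 1504")
    return table_number + TABLE_OFFSET


def seq_id_no_to_table_number(seq_id_no: int) -> int:
    """Convert a SEQ ID NO (1848-3351) to its table-local number (1-1504)."""
    if not TABLE_OFFSET < seq_id_no <= TABLE_OFFSET + TABLE_SIZE:
        raise ValueError("SEQ ID NO must be between 1848 and 3351")
    return seq_id_no - TABLE_OFFSET


print(table_number_to_seq_id_no(1))      # 1848
print(seq_id_no_to_table_number(3351))   # 1504
```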

Because the provided polynucleotides represent partial mRNA
transcripts, two or more polynucleotides of the invention may represent different
regions of the same mRNA transcript and the same gene. Thus, if two or more SEQ ID
NOs: are identified as belonging to the same clone, then either sequence can be used to
obtain the full-length mRNA or gene.
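
Identifying SEQ ID NOs that belong to the same clone amounts to grouping table rows by clone name. The sketch below assumes each row is available as a (SEQ ID NO, clone name) pair; the function name and the clone names in the usage example are hypothetical placeholders, not entries from Tables 1 or 2.

```python
from collections import defaultdict


def seq_ids_sharing_a_clone(rows):
    """Group SEQ ID NOs by clone name and keep clones with two or more
    sequences, i.e. sequences that may cover the same transcript."""
    by_clone = defaultdict(list)
    for seq_id_no, clone_name in rows:
        by_clone[clone_name].append(seq_id_no)
    return {clone: ids for clone, ids in by_clone.items() if len(ids) > 1}


# Hypothetical rows for illustration only.
example_rows = [(1, "cloneA"), (2, "cloneB"), (3, "cloneA")]
print(seq_ids_sharing_a_clone(example_rows))   # {'cloneA': [1, 3]}
```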

EXAMPLE 2 

Results of Public Database Search to Identify Function of Gene Products 

SEQ ID NOs: 1-3351 were translated in all three reading frames to
determine the best alignment with the individual sequences. These amino acid
sequences and nucleotide sequences are referred to, generally, as query sequences,
which are aligned with the individual sequences. Query and individual sequences were
aligned using the BLAST programs, available over the world wide web at
http://www.ncbi.nlm.nih.gov/BLAST/. Again the sequences were masked to various
extents to prevent searching of repetitive sequences or poly-A sequences, using the
XBLAST program for masking low complexity as described above in Example 1.
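
Translation in three reading frames can be illustrated with a short routine. This is a sketch under stated assumptions: it uses the standard genetic code, translates the forward strand only, writes stop codons as "*" and codons containing ambiguous bases as "X"; the function names are hypothetical and do not describe the software used for the searches.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, frame: int) -> str:
    """Translate one forward reading frame (frame = 0, 1 or 2)."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame_translation(dna: str) -> list:
    """Return the translations of the three forward reading frames."""
    return [translate(dna, frame) for frame in range(3)]


print(three_frame_translation("ATGGCCTAA"))   # ['MA*', 'WP', 'GL']
```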

Tables 3 and 4 (inserted before the claims) show the results of the
alignments. Table 3 contains alignment information for SEQ ID NOs: 1-1847 and Table 4
contains alignment information for SEQ ID NOs: 1848-3351. The DNA sequences of Table
4, while numbered SEQ ID 1-1504, correspond to SEQ ID NOs: 1848-3351. Each DNA
sequence in Table 4 is uniquely identified by a number that is 1847 less than its SEQ ID
NO. Tables 3 and 4 refer to each sequence by its SEQ ID NO or a corresponding number,
the accession numbers and descriptions of nearest neighbors from the Genbank and
Non-Redundant Protein searches, and the p values of the search results.

For each of SEQ ID NOs: 1-1847, the best alignment to a protein or DNA
sequence is included in Table 3, and the best alignment for each of SEQ ID NOs: 1848-3351
is included in Table 4. The activity of the polypeptide encoded by SEQ ID
NOs: 1-3351 is the same or similar to the nearest neighbor reported in Table 3 or 4. The
accession number of the nearest neighbor is reported, providing a reference to the activities
exhibited by the nearest neighbor. The search program and database used for the alignment
also are indicated as well as a calculation of the p value.

Full length sequences or fragments of the polynucleotide sequences of
the nearest neighbors can be used as probes and primers to identify and isolate the full
length sequence of SEQ ID NOs: 1-3351. The nearest neighbors can indicate a tissue or
cell type to be used to construct a library for the full-length sequences of SEQ ID
NOs: 1-3351.

EXAMPLE 3 
Members of Protein Families 

The sequences (SEQ ID NOs: 1-3351) were used to conduct a profile
search as described in the specification above. Several of the polynucleotides of the
invention were found to encode polypeptides having characteristics of a polypeptide
belonging to a known protein family (and thus represent new members of these
protein families) and/or comprising a known functional domain (Table 5). "Start" and
"stop" in Table 3 indicate the position within the individual sequences that align with
the query sequence having the indicated SEQ ID NO. The direction indicates the
orientation of the query sequence with respect to the individual sequence, where
forward (for) indicates that the alignment is in the same direction (left to right) as the
sequence provided in the Sequence Listing and reverse (rev) indicates that the
alignment is with a sequence complementary to the sequence provided in the Sequence
Listing.
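
The forward and reverse orientations can be illustrated by recovering the strand that was actually aligned. The sketch below assumes that a "rev" alignment is made with the reverse complement of the listed sequence, the usual convention for nucleic acid alignments; the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def aligned_strand(listed_sequence: str, direction: str) -> str:
    """Return the strand that was aligned, given the direction code
    'for' (as listed) or 'rev' (complementary strand)."""
    if direction == "for":
        return listed_sequence
    if direction == "rev":
        return reverse_complement(listed_sequence)
    raise ValueError("direction must be 'for' or 'rev'")


print(aligned_strand("ATGC", "rev"))   # GCAT
```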

Some polynucleotides exhibited multiple profile hits because, for 
example, the particular sequence contains overlapping profile regions, and/or the 
sequence contains two different functional domains. These profile hits are described in 
more detail below. 

Ank Repeats (ANK). SEQ ID NOs: 187, 1268, 1804, 1819, 1830, 1839,
2652, 3015 and 3267 represent polynucleotides encoding an Ank repeat-containing
protein. The ankyrin motif is a 33 amino acid sequence named for the protein ankyrin
which has 24 tandem 33-amino-acid motifs. Ank repeats were originally identified in
the cell-cycle-control protein cdc10 (Breeden et al., Nature (1987) 329:651). Proteins
containing ankyrin repeats include ankyrin, myotropin, I-kappaB proteins, cell cycle
protein cdc10, the Notch receptor (Matsuno et al., Development (1997) 124(21):4265);
G9a (or BAT8) of the class III region of the major histocompatibility complex
(Biochem J. 290:811-818, 1993), FABP, GABP, 53BP2, Lin12, glp-1, SWI4, and
SWI6. The functions of the ankyrin repeats are compatible with a role in
protein-protein interactions (Bork, Proteins (1993) 17(4):363; Lambert and Bennet, Eur. J.
Biochem. (1993) 211:1; Kerr et al., Current Op. Cell Biol. (1992) 4:496; Bennet et al.,
J. Biol. Chem. (1980) 255:6424).

ATPases Associated with Various Cellular Activities (ATPases).
Sequences within SEQ ID NOs:431, 639, 2135, 2684, 2859, 3197 and 3266 correspond
to a sequence that encodes a novel member of the "ATPases Associated with diverse
cellular Activities" (AAA) protein family. The AAA protein family is composed of a
large number of ATPases that share a conserved region of about 220 amino acids that
contains an ATP-binding site (Froehlich et al., J. Cell Biol. (1991) 114:443; Erdmann et
al., Cell (1991) 64:499; Peters et al., EMBO J. (1990) 9:1757; Kunau et al., Biochimie
(1993) 75:209-224; Confalonieri et al., BioEssays (1995) 17:639;
http://yeamob.pci.chemie.uni-tuebingen.de/AAA/Description.html). The proteins that
belong to this family contain either one or two AAA domains. In general, the AAA
domains in these proteins act as ATP-dependent protein clamps (Confalonieri et al.
(1995) BioEssays 17:639). In addition to the ATP-binding 'A' and 'B' motifs, which are
located in the N-terminal half of this domain, there is a highly conserved region located
in the central part of the domain which was used in the development of the signature
pattern. The consensus pattern is: [LIVMT]-x-[LIVMT]-[LIVMF]-x-[GATMC]-[ST]-
[NS]-x(4)-[LIVM]-D-x-A-[LIFA]-x-R.
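
Consensus patterns written in this notation can be converted mechanically into regular expressions for scanning a translated sequence. The following is a minimal sketch, assuming the usual reading of the notation: "x" is any residue, "[...]" lists allowed residues, "{...}" lists excluded residues, and "(n)" or "(n,m)" is a repeat count. The function name and the test peptide are hypothetical and are not taken from the Sequence Listing.

```python
import re

# One pattern element: a residue set, an exclusion set, or a single
# residue/x, optionally followed by a repeat count such as (4) or (5,7).
_ELEMENT = re.compile(r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?")


def consensus_to_regex(pattern: str) -> str:
    """Convert a dash-separated consensus pattern to a regular expression."""
    pieces = []
    for element in pattern.strip().rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element.strip())
        if match is None:
            raise ValueError(f"unrecognized pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            core = "."                            # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"        # excluded residues
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        pieces.append(core)
    return "".join(pieces)


AAA_PATTERN = ("[LIVMT]-x-[LIVMT]-[LIVMF]-x-[GATMC]-[ST]-[NS]-x(4)-"
               "[LIVM]-D-x-A-[LIFA]-x-R")
aaa_regex = consensus_to_regex(AAA_PATTERN)

# Hypothetical peptide constructed to satisfy the pattern.
print(bool(re.search(aaa_regex, "MKLALLAGSNAAAALDAALARGG")))   # True
```

The same routine handles the other consensus patterns given in this Example, including those that use excluded-residue sets or variable-length gaps.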

Bromodomain (bromodomain). SEQ ID NO: 1814 represents a
polynucleotide encoding a polypeptide having a bromodomain region (Haynes et al.,
1992, Nucleic Acids Res. 20:2693-2603, Tamkun et al., 1992, Cell 68:561-572, and
Tamkun, 1995, Curr. Opin. Genet. Dev. 5:473-477), which is a conserved region of
about 70 amino acids. The bromodomain is thought to be involved in protein-protein
interactions and may be important for the assembly or activity of multicomponent
complexes involved in transcriptional activation. The consensus pattern, which spans a
major part of the bromodomain, is: [STANVF]-x(2)-F-x(4)-[DNS]-x(5,7)-[DENQTF]-
Y-[HFY]-x(2)-[LIVMFY]-x(3)-[LIVM]-x(4)-[LIVM]-x(6,8)-Y-x(12,13)-[LIVM]-x(2)-
N-[SACF]-x(2)-[FY].

Basic Region Plus Leucine Zipper Transcription Factors (BZIP). SEQ
ID NOs:410, 552, 768, 822, 836, 1288, 1365, 1454, 1540, 1549, 1556, 1557, 1563,
1622, 1630, 1704, 1808, 2363, 2424, 3147, 3152, 3158 and 3208 represent
polynucleotides encoding a novel member of the family of basic region plus leucine
zipper transcription factors. The bZIP superfamily (Hurst, Protein Prof. (1995) 2:105;
and Ellenberger, Curr. Opin. Struct. Biol. (1994) 4:12) of eukaryotic DNA-binding
transcription factors encompasses proteins that contain a basic region mediating
sequence-specific DNA-binding followed by a leucine zipper required for dimerization.
The consensus pattern for this protein family is: [KR]-x(1,3)-[RKSAQ]-N-x(2)-
[SAQ](2)-x-[RKTAENQ]-x-R-x-[RK].

EF Hand (EFhand). SEQ ID NOs:820, 1755 and 3285 correspond to
polynucleotides encoding a novel protein in the family of EF-hand proteins. Many
calcium-binding proteins belong to the same evolutionary family and share a type of
calcium-binding domain known as the EF-hand (Kawasaki et al., Protein Prof. (1995)
2:305-490). This type of domain consists of a twelve residue loop flanked on both sides
by a twelve residue alpha-helical domain. In an EF-hand loop the calcium ion is
coordinated in a pentagonal bipyramidal configuration. The six residues involved in the
binding are in positions 1, 3, 5, 7, 9 and 12; these residues are denoted by X, Y, Z, -Y,
-X and -Z. The invariant Glu or Asp at position 12 provides two oxygens for liganding
Ca (bidentate ligand). The consensus pattern includes the complete EF-hand loop as
well as the first residue which follows the loop and which seems to always be
hydrophobic: D-x-[DNS]-{ILVFYW}-[DENSTG]-[DNQGHRK]-{GP}-[LIVMC]-
[DENQSTAGC]-x(2)-[DE]-[LIVMFYW].

Ets Domain (Ets Nterm). SEQ ID NO:1811 represents a polynucleotide
encoding a polypeptide with N-terminal homology in the ETS domain. Proteins of this
family contain a conserved domain, the "ETS-domain," that is involved in DNA
binding. The domain appears to recognize purine-rich sequences; it is about 85 to 90
amino acids in length, and is rich in aromatic and positively charged residues (Wasylyk,
et al., Eur. J. Biochem. (1993) 211:718). The ets gene family encodes a novel class of
DNA-binding proteins, each of which binds a specific DNA sequence and comprises an
ets domain that specifically interacts with sequences containing the common core
trinucleotide sequence GGA. In addition to an ets domain, native ets proteins comprise
other sequences which can modulate the biological specificity of the protein. Ets genes
and proteins are involved in a variety of essential biological processes including cell
growth, differentiation and development, and three members are implicated in
oncogenic process.

G-Protein Alpha Subunit (G-alpha). SEQ ID NO: 1846 represents a
polynucleotide encoding a novel polypeptide of the G-protein alpha subunit family.
Guanine nucleotide binding proteins (G-proteins) are a family of membrane-associated
proteins that couple extracellularly-activated integral-membrane receptors to
intracellular effectors, such as ion channels and enzymes that vary the concentration of
second messenger molecules. G-proteins are composed of 3 subunits (alpha, beta and
gamma) which, in the resting state, associate as a trimer at the inner face of the plasma
membrane. The alpha subunit binds GTP and exhibits GTPase activity. G-protein alpha
subunits are 350-400 amino acids in length and have molecular weights in the range
40-45 kDa. Seventeen distinct types of alpha subunit have been identified in mammals,
and fall into 4 main groups on the basis of both sequence similarity and function:
alpha-s, alpha-q, alpha-i and alpha-12 (Simon et al., Science (1993) 252:802). They are often
N-terminally acylated, usually with myristate and/or palmitoylate, and these fatty acid
modifications can be important for membrane association and high-affinity interactions
with other proteins.

Helicases conserved C-terminal domain (helicase C). SEQ ID
NOs:1496, 2826 and 2871 represent polynucleotides encoding novel members of the
DEAD/H helicase family. A number of eukaryotic and prokaryotic proteins have been
characterized (Schmid S.R., et al., Mol. Microbiol. (1992) 6:283; Linder P., et al.,
Nature (1989) 337:121; Wassarman D.A., et al., Nature (1991) 349:463) on the basis of
their structural similarity. All are involved in ATP-dependent, nucleic-acid unwinding.
All DEAD box family members of the above proteins share a number of conserved
sequence motifs, some of which are specific to the DEAD family while others are
shared by other ATP-binding proteins or by proteins belonging to the helicases
'superfamily' (Hodgman T.C., Nature (1988) 333:22 and Nature (1988) 333:578
(Errata)). One of these motifs, called the "D-E-A-D-box", represents a special version of
the B motif of ATP-binding proteins. Some other proteins belong to a subfamily which
have His instead of the second Asp and are thus said to be "D-E-A-H-box" proteins
(Wassarman D.A., et al., Nature (1991) 349:463; Harosh I., et al., Nucleic Acids Res.
(1991) 19:6331; Koonin E.V. et al., J. Gen. Virol. (1992) 73:989). The following
signature patterns are used to identify members of both subfamilies: 1) [LIVMF](2)-D-
E-A-D-[RKEN]-x-[LIVMFYGSTN]; and 2) [GSAH]-x-[LIVMF](3)-D-E-[ALIV]-H-
[NECR].
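
The two signature patterns can be written directly as regular expressions and used to assign a translated sequence to the D-E-A-D-box or D-E-A-H-box subfamily. The sketch below is illustrative only: the function name is hypothetical, the D-E-A-D pattern is arbitrarily tested first, and the peptides in the usage example are invented to satisfy each pattern.

```python
import re

# Signature 1: [LIVMF](2)-D-E-A-D-[RKEN]-x-[LIVMFYGSTN]
DEAD_BOX = re.compile(r"[LIVMF]{2}DEAD[RKEN].[LIVMFYGSTN]")
# Signature 2: [GSAH]-x-[LIVMF](3)-D-E-[ALIV]-H-[NECR]
DEAH_BOX = re.compile(r"[GSAH].[LIVMF]{3}DE[ALIV]H[NECR]")


def helicase_subfamily(protein: str):
    """Return 'DEAD', 'DEAH', or None for a protein sequence."""
    if DEAD_BOX.search(protein):
        return "DEAD"
    if DEAH_BOX.search(protein):
        return "DEAH"
    return None


# Invented peptides for illustration only.
print(helicase_subfamily("AAVLDEADRMLAA"))   # DEAD
print(helicase_subfamily("GTLIVDEAHERS"))    # DEAH
```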

Homeobox domain (homeobox). SEQ ID NOs:1676, 1820 and 1821 
represent polynucleotides encoding proteins having a homeobox domain. The 
homeobox is a protein domain of 60 amino acids (Gehring In: Guidebook to the 
Homeobox Genes, Duboule D., Ed., pp. 1-10, Oxford University Press, Oxford, (1994); 
Buerglin In: Guidebook to the Homeobox Genes, pp. 25-72, Oxford University Press, 
Oxford, (1994); Gehring, Trends Biochem. Sci. (1992) 17:277-280; Gehring et al., 
Annu. Rev. Genet. (1986) 20:147-173; Schofield, Trends Neurosci. (1987) 10:3-6) first 
identified in a number of Drosophila homeotic and segmentation proteins. It is 
extremely well conserved in many other animals, including vertebrates. This domain 
binds DNA through a helix-turn-helix type of structure. Several proteins that contain a 
homeobox domain play an important role in development. Most of these proteins are 
sequence-specific DNA-binding transcription factors. The homeobox domain is also 
very similar to a region of the yeast mating type proteins. These are sequence-specific 
DNA-binding proteins that act as master switches in yeast differentiation by controlling 
gene expression in a cell type-specific fashion. 

A schematic representation of the homeobox domain is shown below. 
The helix-turn-helix region is shown by the symbols "H" (for helix) and "t" (for turn). 

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxHHHHHHHHtttHHHHHHHHHxxxxxxxxxx 
1                                                         60 

The pattern detects homeobox sequences 24 residues long and spans 
positions 34 to 57 of the homeobox domain. The consensus pattern is as follows: 

[LIVMFYG]-[ASLVR]-x(2)-[LIVMSTACN]-x-[LIVM]-x(4)-[LIV]-[RKNQESTAIY]- 
[LIVFSTNKH]-W-[FYVC]-x-[NDQTAH]-x(5)-[RKNAIMW] 
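The statement that the homeobox signature covers 24 residues (positions 34 to 57) can be checked by summing the repeat counts of the pattern elements. The sketch below does this for any pattern in the same notation. The function name `pattern_length` is an illustrative assumption, and the pattern string is a transcription of the homeobox signature as understood from the public PROSITE entry; it should be verified against that entry before being relied upon.

```python
import re

def pattern_length(pattern: str) -> tuple[int, int]:
    """Return the (minimum, maximum) number of residues a PROSITE-style pattern spans."""
    shortest = longest = 0
    for element in pattern.strip().rstrip(".").split("-"):
        # A trailing (n) or (n,m) gives the repeat count; otherwise the element is one residue.
        repeat = re.search(r"\((\d+)(?:,(\d+))?\)$", element)
        if repeat is None:
            low = high = 1
        else:
            low = int(repeat.group(1))
            high = int(repeat.group(2) or low)
        shortest += low
        longest += high
    return shortest, longest

# Assumed transcription of the homeobox signature (not taken verbatim from this document).
HOMEOBOX = (
    "[LIVMFYG]-[ASLVR]-x(2)-[LIVMSTACN]-x-[LIVM]-x(4)-[LIV]-[RKNQESTAIY]-"
    "[LIVFSTNKH]-W-[FYVC]-x-[NDQTAH]-x(5)-[RKNAIMW]"
)

homeobox_span = pattern_length(HOMEOBOX)   # fixed-length pattern: minimum equals maximum
expected_span = 57 - 34 + 1                # positions 34 to 57, inclusive
```

For patterns with variable spacers, such as x(2,4), the two returned values differ and bound the length of any match.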

MAP kinase kinase (mkk). SEQ ID NOs:29, 31, 196, 3175, 3190 and 
3281 represent novel members of the MAP kinase kinase family. MAP kinases 
(MAPK) are involved in signal transduction, and are important in cell cycle and cell 
growth controls. The MAP kinase kinases (MAPKK) are dual-specificity protein 
kinases which phosphorylate and activate MAP kinases. MAPKK homologues have 
been found in yeast, invertebrates, amphibians, and mammals. Moreover, the 
MAPKK/MAPK phosphorylation switch constitutes a basic module activated in distinct 
pathways in yeast and in vertebrates. MAPKKs are essential transducers through which 
signals must pass before reaching the nucleus. For review, see, e.g., Biologique Biol 
Cell (1993) 79:193-207; Nishida et al., Trends Biochem Sci (1993) 18:128-31; 
Ruderman, Curr Opin Cell Biol (1993) 5:207-13; Dhanasekaran et al., Oncogene (1998) 
17:1447-55; Kiefer et al., Biochem Soc Trans (1997) 25:491-8; and Hill, Cell Signal 
(1996) 8:533-44. 

Protein Kinase (protkinase). SEQ ID NOs:1157, 1478, 1496, 2286, 2969 
and 3190 represent polynucleotides encoding protein kinases. Protein kinases catalyze 
phosphorylation of proteins in a variety of pathways, and are implicated in cancer. 
Eukaryotic protein kinases (Hanks S.K., et al., FASEB J. (1995) 9:576; Hunter T., Meth. 
Enzymol. (1991) 200:3; Hanks S.K., et al., Meth. Enzymol. (1991) 200:38; Hanks S.K., 
Curr. Opin. Struct. Biol. (1991) 1:369; Hanks S.K. et al., Science (1988) 241:42) are 
enzymes that belong to a very extensive family of proteins which share a conserved 
catalytic core common to both serine/threonine and tyrosine protein kinases. There are 
a number of conserved regions in the catalytic domain of protein kinases. The first 
region, which is located in the N-terminal extremity of the catalytic domain, is a 
glycine-rich stretch of residues in the vicinity of a lysine residue, which has been shown 
to be involved in ATP binding. The second region, which is located in the central part 
of the catalytic domain, contains a conserved aspartic acid residue which is important 
for the catalytic activity of the enzyme (Knighton D.R. et al., Science (1991) 253:407). 
The protein kinase profile includes two signature patterns for this second region: one 
specific for serine/threonine kinases and the other for tyrosine kinases. A third profile 
is based on the alignment in (Hanks S.K. et al., FASEB J. (1995) 9:576) and covers the 
entire catalytic domain. 

The consensus patterns are as follows: 
1) [LIV]-G-{P}-G-{P}-[FYWMGSTNH]-[SGA]-{PW}-[LIVCAT]-{PD}-x-[GSTACLIVMFY]- 
x(5,18)-[LIVMFYWCSTAR]-[AIVP]-[LIVMFAGCKR]-K, where K binds ATP; 
2) [LIVMFYC]-x-[HY]-x-D-[LIVMFY]-K-x(2)-N-[LIVMFYCT](3), where D is an active 
site residue; and 
3) [LIVMFYC]-x-[HY]-x-D-[LIVMFY]-[RSTAC]-x(2)-N-[LIVMFYC], where D is an 
active site residue. 

If a protein analyzed includes two of the above protein kinase signatures, 
the probability of it being a protein kinase is close to 100%. 
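The two-signature rule stated above can be sketched as a simple screen: search a sequence for the ATP-binding signature and for each active-site signature, and flag the sequence when at least two are found. The regular expressions below are hand transcriptions of the three patterns; the middle portion of the ATP-binding expression follows the public PROSITE entry as recalled and is an assumption, as are the function names and the test fragments.

```python
import re

# Hand-written regex transcriptions of the three signatures (assumed, not authoritative).
ATP_BINDING = re.compile(
    r"[LIV]G[^P]G[^P][FYWMGSTNH][SGA][^PW][LIVCAT][^PD].[GSTACLIVMFY]"
    r".{5,18}[LIVMFYWCSTAR][AIVP][LIVMFAGCKR]K"
)
SER_THR_ACTIVE_SITE = re.compile(r"[LIVMFYC].[HY].D[LIVMFY]K.{2}N[LIVMFYCT]{3}")
TYR_ACTIVE_SITE = re.compile(r"[LIVMFYC].[HY].D[LIVMFY][RSTAC].{2}N[LIVMFYC]")

def kinase_signatures(sequence: str) -> list[str]:
    """List which of the three protein kinase signatures occur in the sequence."""
    hits = []
    if ATP_BINDING.search(sequence):
        hits.append("ATP-binding")
    if SER_THR_ACTIVE_SITE.search(sequence):
        hits.append("Ser/Thr active site")
    if TYR_ACTIVE_SITE.search(sequence):
        hits.append("Tyr active site")
    return hits

def looks_like_kinase(sequence: str) -> bool:
    """Apply the two-signature rule: two or more signatures present."""
    return len(kinase_signatures(sequence)) >= 2

# Hypothetical fragments, loosely modeled on kinase catalytic-domain motifs.
ser_thr_like = "LGTGSFGRVMLVKHKETGNHYAMKILDKQKVVAAAAHSLDLIYRDLKPENLLIDQQ"
tyr_site_only = "GSYVHRDLRAANILVGE"
```

The active-site signatures also indicate the likely substrate class: a lysine after the D-[LIVMFY] pair points to a serine/threonine kinase, whereas [RSTAC] at that position points to a tyrosine kinase.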

Ras family proteins (ras). SEQ ID NOs:1688 and 3258 represent 
polynucleotides encoding novel members of the ras family of small GTP/GDP-binding 
proteins (Valencia et al., 1991, Biochemistry 30:4637-4648). Ras family members 
generally require a specific guanine nucleotide exchange factor (GEF) and a specific 
GTPase activating protein (GAP) as stimulators of overall GTPase activity. Among 
ras-related proteins, the highest degree of sequence conservation is found in four 
regions that are directly involved in guanine nucleotide binding. The first two 
constitute most of the phosphate and Mg2+ binding site (PM site) and are located in the 
first half of the G-domain. The other two regions are involved in guanosine binding and 
are located in the C-terminal half of the molecule. Motifs and conserved structural 
features of the ras-related proteins are described in Valencia et al., 1991, Biochemistry 
30:4637-4648. A major consensus pattern of ras proteins is: 
D-T-A-G-Q-E-K-[LF]-G-G-L-R-[DE]-G-Y-Y. 
Thioredoxin family active site (Thioredox). SEQ ID NO:1677 represents 
a polynucleotide encoding a protein having a thioredoxin family active site. 
Thioredoxins (Holmgren A., Annu. Rev. Biochem. (1985) 54:237; Gleason F.K. et al., 
FEMS Microbiol. Rev. (1988) 54:271; Holmgren A., J. Biol. Chem. (1989) 264:13963; 
Eklund H. et al., Proteins (1991) 11:13) are small proteins of approximately one 
hundred amino-acid residues which participate in various redox reactions via the 
reversible oxidation of an active center disulfide bond. They exist in either a reduced 
form or an oxidized form where the two cysteine residues are linked in an 
intramolecular disulfide bond. Thioredoxin is present in prokaryotes and eukaryotes 
and the sequence around the redox-active disulfide bond is well conserved. All 
protein disulfide isomerases (PDI) contain two or three (ERp72) copies of the 
thioredoxin domain. The consensus pattern is: 
[LIVMF]-[LIVMSTA]-x-[LIVMFYC]-[FYWSTHE]-x(2)-[FYWGTN]-C- 
[GATPLVE]-[PHYWSTA]-C-x(6)-[LIVMFYWT] (where the two C's form the redox- 
active bond). 

Trypsin (trypsin). SEQ ID NO:1410 corresponds to a novel serine 
protease of the trypsin family. The catalytic activity of the serine proteases from the 
trypsin family is provided by a charge relay system involving an aspartic acid residue 
hydrogen-bonded to a histidine, which itself is hydrogen-bonded to a serine. The 
sequences in the vicinity of the active site serine and histidine residues are well 
conserved in this family of proteases (Brenner S., Nature (1988) 334:528). The 
consensus patterns for this trypsin protein family are: 
1) [LIVM]-[ST]-A-[STAG]-H-C, where H is the active site residue; and 
2) [DNSTAGC]-[GSTAPIMVQH]-x(2)-G-[DE]-S-G-[GS]-[SAPHV]-[LIVMFYWH]- 
[LIVMFYSTANQH], where S is the active site residue. 
All sequences known to belong to this family are detected by the above 
consensus sequences, except for 18 different proteases which have lost the first 
conserved glycine. If a protein includes both the serine and the histidine active site 
signatures, the probability of it being a trypsin family serine protease is 100%. 

WD Domain, G-Beta Repeats (WD domain). SEQ ID NOs:1336, 1380, 
1711, 1762, 1909, 2218, 3047, 3108 and 3292 represent novel members of the WD 
domain/G-beta repeat family. Beta-transducin (G-beta) is one of the three subunits 
(alpha, beta, and gamma) of the guanine nucleotide-binding proteins (G proteins) which 
act as intermediaries in the transduction of signals generated by transmembrane 
receptors (Gilman, Annu. Rev. Biochem. (1987) 56:615). The alpha subunit binds to 
and hydrolyzes GTP; the functions of the beta and gamma subunits are less clear but 
they seem to be required for the replacement of GDP by GTP as well as for membrane 
anchoring and receptor recognition. In higher eukaryotes, G-beta exists as a small 
multigene family of highly conserved proteins of about 340 amino acid residues. 
Structurally, G-beta consists of eight tandem repeats of about 40 residues, each 
containing a central Trp-Asp motif (this type of repeat is sometimes called a WD-40 
repeat). The consensus pattern for the WD domain/G-Beta repeat family is: 
[LIVMSTAC]-[LIVMFYWSTAGC]-[LIMSTAG]-[LIVMSTAGC]-x(2)-[DN]-x(2)- 
[LIVMWSTAC]-x-[LIVMFSTAG]-W-[DEN]-[LIVMFSTAGCN]. 

wnt Family of Developmental Signaling Proteins (Wnt dev sign). SEQ 
ID NO:1538 corresponds to a novel member of the wnt family of developmental 
signaling proteins. Wnt-1 (previously known as int-1), the seminal member of this 
family (Nusse R., Trends Genet. (1988) 4:291), is thought to play a role in intercellular 
communication and seems to be a signalling molecule important in the development of 
the central nervous system (CNS). All wnt family proteins share the following features 
characteristic of secretory proteins: a signal peptide, several potential N-glycosylation 
sites and 22 conserved cysteines that are probably involved in disulfide bonds. The 
Wnt proteins seem to adhere to the plasma membrane of the secreting cells and are 
therefore likely to signal over only a few cell diameters. The consensus pattern, which is 
based upon a highly conserved region including three cysteines, is as follows: 
C-K-C-H-G-[LIVMT]-S-G-x-C. 

Protein Tyrosine Phosphatase (Y phosphatase). SEQ ID NO: HI 7 
represents a polynucleotide encoding a protein tyrosine phosphatase. Tyrosine specific 
protein phosphatases (EC 3.1.3.48) (PTPase) (Fischer et al., Science (1991) 253:401; 
Charbonneau et al., Annu. Rev. Cell Biol. (1992) 8:463; Trowbridge, J. Biol. Chem. 
(1991) 266:23517; Tonks et al., Trends Biochem. Sci. (1989) 14:497; and Hunter, Cell 
(1989) 58:1013) catalyze the removal of a phosphate group attached to a tyrosine 
residue. These enzymes are very important in the control of cell growth, proliferation, 
differentiation and transformation. Multiple forms of PTPase have been characterized 
and can be classified into two categories: soluble PTPases and transmembrane receptor 
proteins that contain PTPase domain(s). Structurally, all known receptor PTPases are 
made up of a variable length extracellular domain, followed by a transmembrane region 
and a C-terminal catalytic cytoplasmic domain. PTPase domains consist of about 300 
amino acids. There are two conserved cysteines, the second of which has been shown 
to be absolutely required for activity. Furthermore, a number of conserved residues in 
its immediate vicinity have also been shown to be important. The consensus pattern 
for PTPases is: 
[LIVMF]-H-C-x(2)-G-x(3)-[STC]-[STAGP]-x-[LIVMFY]; C is the active site residue. 
Zinc Finger, C2H2 Type (Zincfing C2H2). SEQ ID NOs:308, 807, 
1324, 1503, 1527, 3081, 3193 and 3306 correspond to polynucleotides encoding novel 
members of the C2H2 type zinc finger protein family. Zinc finger domains (Klug 
et al., Trends Biochem. Sci. (1987) 12:464; Evans et al., Cell (1988) 52:1; Payre et al., 
FEBS Lett. (1988) 234:245; Miller et al., EMBO J. (1985) 4:1609; and Berg, Proc. Natl. 
Acad. Sci. USA (1988) 85:99) are nucleic acid-binding protein structures. In addition to 
the conserved zinc ligand residues, it has been shown that a number of other positions 
are also important for the structural integrity of the C2H2 zinc fingers (Rosenfeld et al., 
J. Biomol. Struct. Dyn. (1993) 11:557). The best conserved position is found four 
residues after the second cysteine; it is generally an aromatic or aliphatic residue. The 
consensus pattern for C2H2 zinc fingers is: C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H- 
x(3,5)-H. The two C's and two H's are zinc ligands. 
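Because C2H2 zinc fingers usually occur as tandem repeats, it can be useful to report every occurrence of the pattern together with the positions of the four zinc ligands. The following sketch does so with capturing groups around the two cysteines and two histidines. The regular expression is a hand transcription of the consensus pattern given above; the function name and the two-finger test sequence are hypothetical.

```python
import re

# C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H, with the four zinc ligands captured.
C2H2 = re.compile(r"(C).{2,4}(C).{3}[LIVMFYWC].{8}(H).{3,5}(H)")

def find_c2h2_fingers(sequence: str) -> list[dict]:
    """Return one record per non-overlapping match, using 1-based residue positions."""
    fingers = []
    for match in C2H2.finditer(sequence):
        fingers.append({
            "span": (match.start() + 1, match.end()),
            "zinc_ligands": [match.start(group) + 1 for group in range(1, 5)],
        })
    return fingers

# Hypothetical sequence containing two tandem fingers joined by a TGEKP-like linker.
two_fingers = "YKCPECGKSFSRSDELTRHIRIHTGEKPFQCRICMRNFSRSDHLTTHIRTHTGEKP"
fingers = find_c2h2_fingers(two_fingers)
```

Note that `finditer` reports non-overlapping matches scanned from left to right, which suits tandem repeats; overlapping candidate matches would require a lookahead-based search instead.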

Src homology 2. SEQ ID NOs:186, 2591, 3307 and 3339 represent 
polynucleotides encoding novel members of the family of Src homology 2 (SH2) 
proteins. The Src homology 2 (SH2) domain is a protein domain of about 100 amino 
acid residues first identified as a conserved sequence region between the oncoproteins 
Src and Fps (Sadowski I. et al., Mol. Cell. Biol. 6:4396-4408 (1986)). Similar sequences 
are found in many other intracellular signal-transducing proteins (Russel R.B. et al., 
FEBS Lett. 304:15-20 (1992)). SH2 domains function as regulatory modules of 
intracellular signalling cascades by binding with high affinity to phosphotyrosine- 
containing target peptides in a sequence-specific and phosphorylation-dependent 
manner (Marengere L.E.M., Pawson T., J. Cell Sci. Suppl. 18:97-104 (1994); Pawson 
T., Schlessinger J., Curr. Biol. 3:434-442 (1993); Mayer B.J., Baltimore D., Trends 
Cell Biol. 3:8-13 (1993); Pawson T., Nature 373:573-580 (1995)). 

The SH2 domain has a conserved 3D structure consisting of two alpha 
helices and six to seven beta-strands. The core of the domain is formed by a continuous 
beta-meander composed of two connected beta-sheets (Kuriyan J., Cowburn D., Curr. 
Opin. Struct. Biol. 3:828-837 (1993)). The profile to detect SH2 domains is based on a 
structural alignment consisting of 8 gap-free blocks and 7 linker regions totaling 92 
match positions. 

Src homology 3. SEQ ID NOs:234, 1832, and 1835 represent 
polynucleotides encoding novel members of the family of Src homology 3 (SH3) 
proteins. The Src homology 3 (SH3) domain is a small protein domain of about 60 
amino acid residues first identified as a conserved sequence in the non-catalytic part of 
several cytoplasmic protein tyrosine kinases (e.g., Src, Abl, Lck) (Mayer B.J. et al., 
Nature 332:272-275 (1988)). Since then, it has been found in a great variety of other 
intracellular or membrane-associated proteins (Musacchio A. et al., FEBS Lett. 307:55- 
61 (1992); Pawson T., Schlessinger J., Curr. Biol. 3:434-442 (1993); Mayer B.J., 
Baltimore D., Trends Cell Biol. 3:8-13 (1993); Pawson T., Nature 373:573-580 (1995)). 

The SH3 domain has a characteristic fold which consists of five or six 
beta strands arranged as two tightly packed anti-parallel beta sheets. The linker regions 
may contain short helices (Kuriyan J., Cowburn D., Curr. Opin. Struct. Biol. 3:828-837 
(1993)). 

The function of the SH3 domain may be to mediate assembly of specific 
protein complexes via binding to proline-rich peptides (Morton C.J., Campbell I.D., 
Curr. Biol. 4:615-617 (1994)). 

In general SH3 domains are found as single copies in a given protein, but 
there are a significant number of proteins with two SH3 domains and a few with 3 or 4 
copies. 

Fibronectin type III. SEQ ID NOs:746 and 1192 represent 
polynucleotides encoding novel members of the family of fibronectin type III proteins. 
A number of receptors for lymphokines, hematopoietic growth factors and growth 
hormone-related molecules have been found to share a common binding domain 
(Bazan J.F., Biochem. Biophys. Res. Commun. 164:788-795 (1989); Bazan J.F., Proc. 
Natl. Acad. Sci. U.S.A. 87:6934-6938 (1990); Cosman D. et al., Trends Biochem. Sci. 
15:265-270 (1990); d'Andrea A.D., Fasman G.D., Lodish H.F., Cell 58:1023-1024 
(1989); d'Andrea A.D., Fasman G.D., Lodish H.F., Curr. Opin. Cell Biol. 2:648-651 
(1990)). 

The conserved region constitutes all or part of the extracellular ligand- 
binding region and is about 200 amino acid residues long. In the N-terminal part of this 
domain there are two pairs of cysteines known, in the growth hormone receptor, to be 
involved in disulfide bonds. 

[Schematic: extracellular region containing two pairs of disulfide-linked 
cysteines (C C C C), followed by the transmembrane region.] 

Two patterns detect this family of receptors. The first one is derived 
from the first N-terminal disulfide loop, the second is a tryptophan-rich pattern located 
at the C-terminal extremity of the extracellular region. 
A consensus for this protein family is: C-[LVFYR]-x(7,8)-[STIVDN]-C- 
x-W [the two C's are linked by a disulfide bond]. A second consensus for this protein 
family is: [STGL]-x-W-[SG]-x-W-S. 

LIM domain containing proteins. SEQ ID NOs:1269, 1309, 1360, and 
1386 represent polynucleotides encoding novel members of the family of LIM domain 
containing proteins. A number of proteins contain a conserved cysteine-rich domain of 
about 60 amino-acid residues (Freyd G. et al., Nature 344:876-879 (1990); Baltz R. et 
al., Plant Cell 4:1465-1466 (1992); Sanchez-Garcia I., Rabbitts T.H., Trends Genet. 
10:315-320 (1994)). 

In the LIM domain, there are seven conserved cysteine residues and a 
histidine. The arrangement followed by these conserved residues is C-x(2)-C-x(16,23)- 
H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD]. The LIM domain binds two zinc 
ions (Michelsen J.W. et al., Proc. Natl. Acad. Sci. U.S.A. 90:4404-4408 (1993)). LIM 
does not bind DNA; rather, it seems to act as an interface for protein-protein interaction. 
The consensus for this protein family is: C-x(2)-C-x(15,21)-[FYWH]-H-x(2)-[CH]- 
x(2)-C-x(2)-C-x(3)-[LIVMF]. The 5 C's and the H bind zinc. 

C2 domain (protein kinase C like). SEQ ID NOs:1325 and 2282 
represent polynucleotides encoding novel members of the family of C2 domain 
containing proteins. Some isozymes of protein kinase C (PKC) contain a domain, 
known as C2, of about 116 amino-acid residues, which is located between the two 
copies of the C1 domain (that bind phorbol esters and diacylglycerol) and the protein 
kinase catalytic domain (Azzi A. et al., Eur. J. Biochem. 208:547-557 (1992); Stabel S., 
Semin. Cancer Biol. 5:277-284 (1994)). 

The C2 domain is involved in calcium-dependent phospholipid binding 
(Davletov B.A., Suedhof T.C., J. Biol. Chem. 268:26386-26390 (1993)). Since 
domains related to the C2 domain are also found in proteins that do not bind calcium, 
other putative functions for the C2 domain include binding to inositol-1,3,4,5- 
tetraphosphate (Fukuda M., et al., J. Biol. Chem. 269:29206-29211 (1994)). 

The consensus pattern for the C2 domain is located in a conserved part 
of that domain, the connecting loop between beta strands 2 and 3. The profile for the C2 
domain covers the total domain. The consensus for this protein family is: [ACG]-x(2)- 
L-x(2,3)-D-x(1,2)-[NGSTLIF]-[GTMR]-x-[STAP]-D-[PA]-[FY]. 

Serine proteases, trypsin family, active sites. SEQ ID NO:1410 
represents a polynucleotide encoding a novel member of the family of serine protease, 
trypsin proteins. The catalytic activity of the serine proteases from the trypsin family is 
provided by a charge relay system involving an aspartic acid residue hydrogen-bonded 
to a histidine, which itself is hydrogen-bonded to a serine. The sequences in the vicinity 
of the active site serine and histidine residues are well conserved in this family of 
proteases (Brenner S., Nature 334:528-530 (1988)). 

A consensus for this protein family is: [LIVM]-[ST]-A-[STAG]-H-C [H 
is the active site residue]. A second consensus for this protein family is: [DNSTAGC]- 
[GSTAPIMVQH]-x(2)-G-[DE]-S-G-[GS]-[SAPHV]-[LIVMFYWH]- 
[LIVMFYSTANQH] [S is the active site residue]. 

RNA Recognition Motif Domain (RRM, RBD, or RNP). SEQ ID NOs: 
1464 and 1514 represent polynucleotides encoding novel members of the family of 
RNA recognition motif domain proteins (Bandziulis R.J. et al., Genes Dev. 3:431-437 
(1989); Dreyfuss G. et al., Trends Biochem. Sci. 13:86-91 (1988)). 

Inside the putative RNA-binding domain there are two regions which are 
highly conserved. The first one is a hydrophobic segment of six residues (which is 
called the RNP-2 motif); the second one is an octapeptide motif (which is called RNP-1 
or RNP-CS). The position of both motifs in the domain is shown in the following 
schematic representation: 

xxxxxxx####xxxxxxxxxxxxxxxxxxxxxxxxxxxxx######xxxxxxxx 
      RNP-2                             RNP-1 

As a consensus pattern for this type of domain the RNP-1 motif was 
used. The consensus for this protein family is: [RK]-G-{EDRKHPCG}-[AGSCI]- 
[FY]-[LIVA]-x-[FYLM]. 

Phosphatidylinositol-specific phospholipase C, Y Domain. SEQ ID NO: 
1707 represents a polynucleotide encoding a novel member of the phosphatidylinositol- 
specific phospholipase C, Y domain family of proteins. Phosphatidylinositol-specific 
phospholipase C (EC 3.1.4.11), a eukaryotic intracellular enzyme, plays an important 
role in signal transduction processes (Meldrum E. et al., Biochim. Biophys. Acta 
1092:49-71 (1991)). It catalyzes the hydrolysis of 1-phosphatidyl-D-myo-inositol- 
4,5-bisphosphate into the second messenger molecules diacylglycerol and inositol- 
1,4,5-triphosphate. This catalytic process is tightly regulated by reversible 
phosphorylation and binding of regulatory proteins (Rhee S.G., Choi K.D., Adv. Second 
Messenger Phosphoprotein Res. 26:35-61 (1992); Rhee S.G., Choi K.D., J. Biol. Chem. 
267:12393-12396 (1992); Sternweis P.C., Smrcka A.V., Trends Biochem. Sci. 17:502- 
506 (1992)). 

All eukaryotic PI-PLCs contain two regions of homology, referred to as 
"X-box" and "Y-box". The order of these two regions is the same (NH2-X-Y-COOH), 
but the spacing is variable. In most isoforms, the distance between these two regions is 
only 50-100 residues, but in the gamma isoforms one PH domain, two SH2 domains, 
and one SH3 domain are inserted between the two PLC-specific domains. The two 
conserved regions have been shown to be important for the catalytic activity. At the C- 
terminal end of the Y-box, there is a C2 domain possibly involved in Ca-dependent 
membrane attachment. 

Serine Carboxypeptidases. SEQ ID NO:1744 represents a 
polynucleotide encoding a novel member of the serine carboxypeptidases family of 
proteins. Carboxypeptidases may be either metallo carboxypeptidases or serine 
carboxypeptidases (EC 3.4.16.5 and EC 3.4.16.6). The catalytic activity of the serine 
carboxypeptidases, like that of the trypsin family serine proteases, is provided by a 
charge relay system involving an aspartic acid residue hydrogen-bonded to a histidine, 
which is itself hydrogen-bonded to a serine (Liao D.I., Remington S.J., J. Biol. Chem. 
265:6528-6531 (1990)). 

The sequences surrounding the active site serine and histidine residues 
are highly conserved in all these serine carboxypeptidases. A consensus for this protein 
family is: [LIVM]-x-[GTA]-E-S-Y-[AG]-[GS] [S is the active site residue]. A second 
consensus for this protein family is: [LIVF]-x(2)-[LIVSTA]-x-[IVPST]-x-[GSDNQL]- 
[SAGV]-[SG]-H-x-[IVAQ]-P-x(3)-[PSA] [H is the active site residue]. 

dsrm Double-Stranded RNA Binding Motif. SEQ ID NO:1818 
represents a polynucleotide encoding a novel member of the dsrm double-stranded 
RNA binding motif proteins. In eukaryotic cells, a multitude of RNA-binding proteins 
play key roles in the posttranscriptional regulation of gene expression. Characterization 
of these proteins has led to the identification of several RNA-binding motifs. Several 
human and other vertebrate genetic disorders are caused by aberrant expression of 
RNA-binding proteins (C. G. Burd & G. Dreyfuss, Science 265:615-621 (1994)). 

Proteins containing double stranded RNA binding motifs bind to specific 
RNA targets. Double stranded RNA binding motifs are exemplified by interferon- 
induced protein kinase in humans, which is part of the cellular response to dsRNA. 

SEQ ID NOs:2577, 3183 and 3195 encode members of the 4 transmembrane 
integral membrane protein family. This family consists of type III proteins, 
which are integral membrane proteins that contain an N-terminal membrane-anchoring 
domain that is not cleaved during biosynthesis, and which functions as a translocation 
signal and a membrane anchor. The proteins also have three additional transmembrane 
regions. The consensus pattern is: G-x(3)-[LIVMF]-x(2)-[GSA]-[LIVMF](2)-G-C- 
[GA]-[STA]-x(2)-[EG]-x(2)-[CWN]-[LIVM](2). 

SEQ ID NO:2944 encodes a polypeptide having a calpain large subunit, 
domain III. Calpains are a family of intracellular proteases that play a variety of 
biological roles. Calpain 3, also known as p94, is predominantly expressed in skeletal 
muscle and plays a role in limb-girdle muscular dystrophy type 2A (Sorimachi, H. et 
al., Biochem. J. 328:721-732, 1997). 

SEQ ID NOs:1911 and 1980 encode polypeptides having a C3HC4 type 
zinc finger domain (RING finger), which is a cysteine-rich domain of 40 to 60 residues 
that binds two atoms of zinc, and is believed to be involved in mediating protein-protein 
interactions. Mammalian proteins of this family include V(D)J recombination 
activating protein, which activates the rearrangement of immunoglobulin and T-cell 
receptor genes; breast cancer type 1 susceptibility protein (BRCA1); bmi-1 proto- 
oncogene; cbl proto-oncogene; and mel-18 protein, which is expressed in a variety of 
tumor cells and is a transcriptional repressor that recognizes and binds a specific DNA 
sequence. The consensus pattern is: C-x-H-x-[LIVMFY]-C-x(2)-C-[LIVMYA]. 

SEQ ID NO:3274 encodes a eukaryotic transcription factor with a fork 
head domain of about 100 amino acid residues. Proteins of this group are transcription 
factors, including mammalian transcription factors HNF-3-alpha, -beta, and -gamma; 
interleukin-enhancer binding factor; and HTLF, which binds to a region of human T- 
cell leukemia virus long terminal repeat. The consensus pattern is: [KR]-P-[PTQ]- 
[FYLVQH]-S-[FY]-x(2)-[LIVM]-x(3,4)-[AC]-[LIM]. 

SEQ ID NO:3345 encodes a polypeptide having a PDZ domain. Several 
dozen signaling proteins belong to this group of proteins that have 80-100 residue 
repeats known as PDZ domains. Several of the proteins interact with the C-terminal 
tetrapeptide motifs X-Ser/Thr-X-Val-COO- of ion channels and/or receptors (Ponting, 
C. P., Protein Sci. 6:464-468, 1997). 
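Unlike the internal signatures described elsewhere in this section, the tetrapeptide recognized by PDZ domains must lie at the free C-terminus of the ion channel or receptor. A minimal sketch of such an anchored test follows; the function name and the example tails are hypothetical, and the check is only a first-pass screen for candidate binding partners.

```python
import re

# X-Ser/Thr-X-Val at the free C-terminus: any residue, S or T, any residue, V, end of chain.
PDZ_LIGAND_TAIL = re.compile(r".[ST].V$")

def has_pdz_binding_tail(sequence: str) -> bool:
    """True when the last four residues fit the X-Ser/Thr-X-Val motif."""
    return PDZ_LIGAND_TAIL.search(sequence.strip().upper()) is not None

# Hypothetical C-terminal fragments.
candidates = {
    "MKLSIESDV": has_pdz_binding_tail("MKLSIESDV"),   # motif at the C-terminus
    "MKLSIESDA": has_pdz_binding_tail("MKLSIESDA"),   # last residue is not Val
    "MKESDVKLA": has_pdz_binding_tail("MKESDVKLA"),   # motif present but internal
}
```

The `$` anchor is what restricts the match to the end of the chain, so an internal occurrence of the same four residues is not reported.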

SEQ ID NO:3351 encodes a polypeptide in the family of phorbol 
esters/diacylglycerol binding proteins. Phorbol esters (PE) are analogues of diacylglycerol 
(DAG) and potent tumor promoters. DAG activates a family of serine-threonine protein 
kinases, known as protein kinase C. The N-terminal region of protein kinase C binds 
PE and DAG, and contains one or two copies of a cysteine-rich domain of about 50 
amino acid residues. Other proteins having this domain include diacylglycerol kinase; 
the vav oncogene; and N-chimaerin, a brain-specific protein. The DAG/PE binding 
domain binds two zinc ions through the six cysteines and two histidines that are 
conserved in the domain. The consensus pattern is: H-x-[LIVMFYW]-x(8,11)-C-x(2)- 
C-x(3)-[LIVMFC]-x(5,10)-C-x(2)-C-x(4)-[HD]-x(2)-C-x(5,9)-C. 



SEQ ID NO:2216 encodes a polypeptide having a WW/rsp5/WWP 
domain. The domain is named for the presence of conserved aromatic positions, 
generally tryptophan, as well as a conserved proline. Proteins having the domain 
include dystrophin, vertebrate YAP protein, and IQGAP, a human GTPase activating 
protein which acts on ras. The consensus pattern is: W-x(9,11)-[VFY]-[FYW]-x(6,7)- 
[GSTNE]-[GSTQCR]-[FYW]-x(2)-P. 

SEQ ID NO:2428 encodes a member of the dual specificity phosphatase 
family, having a catalytic domain, and SEQ ID NOs:2281 and 2310 encode members 
of the protein tyrosine phosphatase family. These families are related and classified as 
tyrosine specific protein phosphatases. The enzymes catalyze the removal of a 
phosphate group from a tyrosine residue, and are important in the control of cell growth, 
proliferation, differentiation, and transformation. The consensus pattern is: [LIVMF]-H- 
C-x(2)-G-x(3)-[STC]-[STAGP]-x-[LIVMFY]. 
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Table 1 



SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATIOh 


J CLONE ID 


' ~~ 1 

L!BRARY, 


1 


377044 


RTA00002676F,p.l l.2.P.Seq 




MOOOj9329A.C0I 


PWOQI \/l 


2 


377708 


RTA000O2633F.m_0i.2,P.Set 


1 F 


M000400S94 G08 


fHnoj nji 




427732 


RTA00002666F.1.06.1. P.Seq 




M00032633D:A06 




4 


29372 


RTA000027 1 2F.3.06. 1 .P.Seq 


p 


MOOOTTS' 1 A CO^ 




5 


455003 


RTA00002694 F. b.02 .LP. Seq 


p 


M0004"419DAI0 


Cn-UvUnL V 


6 


380625 


RTA00002684F,d03.2.P.Seq 


p 


M00040I I3D:G10 




7 


450959 


RTA0000269IF.b.05.3. P.Seq 


p 


MD0043306DB07 


v.ni (\-unL v 


3 


397351 


RTA00002680F.b.04, 1 ,P,Seq 


p 


1 M0O039775A A09 




9 


20652 


RTA000027 lOFk.OLI .P,Seq 


F 


M000^440B EOl 




10 


97330 


RTAOOO02663F.k. 18, LP.Seq 


p- 


»M000" >o 767B G 1 1 




1! 


373071 


RTA00002670F.j.23.1.P.Seq 


P 


iVI0003344"* \ D06 


fMOQf Nil 


12 


162369 


RTA000027 1 3 F.e.O 1. 1 .P.Seq 




M000"'7' 1 9' > D'FI0 


PWOJVI Al 


! n 


401247 


RTA00002685F. f. 1 5.2.P.Seq 


p 


M00039*08A CP 


ru 1 ">PHT 


14 


430738 


RTA00002669F. j. 1 5.3-.P.Seq 


F 


M000" *"* > ! H-RfiO 


V~ nUoL.Nn 


15 


46779 


RTA000027IIFx.I4.LP.Seq 




M000" > * T 860C G04 




16 


375772 


RTA00002681F.p.0L2.P.Seq 




MOOl) "J 9909G GO > 


punoi Nil 


17 


430689 


RTA00002669FJ.OL3. P.Seq 




MOOO » VU"*R' aO> 


runil NIW 


IS 


376546 


RTA00002677F.d.07.2.P.Seq 




M000j9"-i>r ri ■> 


r^wooi vi 


19 


430041 


RTA00002667F.H 7. LP.Seq 








20 


431643 


RTA00002669F.M6. 1 P.Seq 




MOOO > ^ "^n- MOQ 


v_ PlUoLiNn 


21 


19422 


RTA00002709F.C.02, LP.Seq 




M 0000 ^449 R RIO 




22 


376S02 


RTA00002677F.C. ! S.2.P Seq | F 


iVt 000 1 9 "J 4-^ R GO 7 


tflUVL.NL 


23 


376814 


RTA0000:674Fh.02 i 1 .P.Seq 








24 


375492 


RTA0000:.6~F.m i !9.2.P.Seq 




MO00«94|SBDOS 


GMOOf Nil 


25 


379! 14 


RTA0000263 1 F.n.24.2.P,Seq 




MO0039903C:F03 


CH0°LN'L 


26 


380668 


RTA00002670F p. 1 LI P.Seq 




M0O0*^^8lCH 10 




27 


213817 


RTA0000266*lF,!.19^P.S*q 




MOOO^^'-iA-DI ! 


CH04MAL 


28 


375749 


RT A 00002 6 80 FT, 2 3. 1 P.Seq 




M00039795D:G06 




29 


430896 | 


RrA00002669F,b,20.4.P.Seq 




M00033135C:D01 


CHOSLNH 


30 


380462 


RTA00002670F.O.0 1. 1. P.Seq 




M0O0"3^^OB F06 




31 


430396 


RTA00002669F.b.20.3. P.Seq 




MOOO^^lS^CDOi 


CHOSLNH 


32 


376996 . 


RTA0000:676F.p.l3.2.P.Seq 




M00039329C:BIO 


CHO^LNL 


33 


374346 


RTA00O02(>77F.k. 1 9.2.P.Seq 




M00039412D:G06 


CHO^LNL 


34 


379075 


RTA00002672Fn.l3.2P.Seq 




M000390;^B:E03 


CH0 Q LNL 


35 


374172 


RTAG0002673F.k.l6.2.P.Seq 




M0003909"D:D06 


CH09LNL 


36 


373104 


RTA00002633F.O. 1 5.2.P.Seq 




M0004O0^SD:GI2 


CH0°LNL 


37 


186302 


RTAOOOOZ 7 1 3 F. m .2 1 . ! .P.Seq 




M000275OJ8C04 


CHO-^MAL 


38 


427947 


RTA00002665F.O.0 1 . 1 .P.Seq 




M000324O5B:D02 


CHOSLNH 


39 


375180 


RTA00002673F.d.l 7.1. P.Seq 




M0003906iD:H09 


CH0°LNL 


40 


377584 


RTA00002633F.1.22.2. P.Seq 




M000400S8C.EIO 


CH0OLNL 


41 


377364 


RTA0000267SF.a.l5,2.P.Seq 




M0C03943:c-A0! 


CH0°LNL 


42 


37634" 


RTA00OO2675F.L08. LP.Seq 




M000392J^C:G!I 


CH0°LNL 


43 


446747 


RTAOOO026S*F.d. 16.2, P.Seq 




M00042740A:E09 


CHS5CON 


44 


28092 


RTA000C1 IF... 12.1. P.Seq \ 




MOO023032A:B05 


CH03MAH 


45 


378206 


RTA000O267IF.J.2O.3 P.Seq 




M000j35SSC:G04 


CH0°LNL 


46 


373206 


RTAOOOO:cTlFa.20.1P.Seq 




M000335SSC;G04 


CH0°LNL 


47 


14940 


RTA0C0ii2"09F.J.l l.f P.Se'j 


F | M0000J623A:G02 | CH02COH 
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SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


1 CLONE ID 


LIBRARY 


48 


37841 1 


RTA00002672F.g.!3.2.P.Seq 


F 


M00039004B:A06 


CH091 NL 1 


49 


38120 


RTA000027 1 2F.U 4. 1 .P.Seq 


F 


M00026927D:F02 


CH04MAL 


50 


375730 


RTA00002678F.U3.2,P.Scq 


F 


M000396I2B:G05 


CH09[ NIL 


51 


428959 


RTA00002667F.h. 1 5. 1 .P.Seq 


F 


M000328I IB:D02 


CH08L NIH 


52 


376851 


RTA00002677F.C.03.2, P.Seq 


F 


M00039341C:H1 1 


CH091 NL 


53 


373808 


RTA0000267IF.d,l4.2.P,Seq 


F 


M00038272A;G0l 


CH09LNL 


54 


376168 


RTAQ0002675F.n. 1 7. 1 .P.Seq 


F 


M00039258B;E06 


CH09LNL 


55 


18653 


RTA000027l2F.o.08.I.P.Seq 


F 


M00027I35A:B1 1 


CH04MAL 


56 


187632 


RTA00002664F.i.15.1.P.Seq


F 


M00027617B:C12




57 


374122 


RTA00002673F.I.22. 1 .P.Seq 


F 


M00039104D:C09 


CH09LNL


58 


374946 


RTA00002673F.j.24.1.P.Seq


F 


M00039096A:E07


CH09LNL


59 


375666 


RTA00002677F.n.16.2.P.Seq


F 


M00039422D:F04 


CH09LNL 


60 


162369 


RTA00002713F.d.24.1.P.Seq


F 


MOOO J> 729' ,, DF10 


CH04MAL


61 


21480 


RTA00002709F.c.18.2.P.Seq


F 


M00005531D:F06




62 


18560 


RTA00002711F.e.20.1.P.Seq


F 


M000" n 938BF07 


v. nyj ivirt n 


63 


96575 


RTA00002663F.j.08.1.P.Seq


F 


i»iuuu_«o4 i \~ .nuj 




64 


377576 


RTA00002682F.f.18.1.P.Seq


F 


M0003997*C-ri I 




65 


446747 


RTA00002689F.d.16.3.P.Seq


F 


M0004"* 740 A • F09 


n i ^?v_ win 


66 


379311 


RTA00002682F.g.01.1.P.Seq


F 


M000»9976D- A P 


CH09LNL


67 


379311


RTA00002682F.f.24.1.P.Seq


F 


M000^9976D ■ \ P 


CH09LNL


68 


124549 


RTA00002713F.c.07.1.P.Seq


F 


M000" 1 7"' , 57C , B08 




69 


449785 


RTA0000269 1 F.c.OT.J.P.Sea 


F 


M00043^4^C- A06 


CH17rni-fI V 


70 


375134 


RTA00002673F.k.22.1.P.Seq


F 


M000"9099A H0S 


CH09LNL


71 


186593


RTA00002713F.n.15.1.P.Seq


F 


M000'*76*>0D"FI I 


CH04MAL


72 


449831


RTA00002691F.a.17.3.P.Seq


F 


M0004°5I8D' A06 


CH 17COHT V 


73 


379678 


RTA00002676F.b.06.1.P.Seq


F 


MOOO^^B G07 


CH09LNL


74 


20599 


RTA00002708F.h.06.1.P.Seq


F 


^00004^646 A0-> 


CH01COH 


75 


41115


RTA00002713F,o.M.I.P.Seq 


F 


M000"*76^" , B FI i 


CH04MAL 


76 


21109 


RTA00002708F.h.12.1.P.Seq


F 


M0000427$A:F09 


CH01COH


77 


455702 


RTA00002694F.b.11.1.P.Seq


F


M00043433C:G07


CH20COHLV


78 


380643 


RTA00002683F.p.09.2.P.Seq


F 


M00040103B:H10 


CH09LNL 


79 


374413 


RTA00002672F.i.15.2.P.Seq


F 


M00039015B:G10 


CH09LNL


80 


378891 


RTA00002672F.i.18.2.P.Seq


F 


M00039016A:A02


CH09LNL 


81 


379374 


RTA00002672F.k.11.2.P.Seq


F 


M00039028C:B11


CH09LNL


82 


17253 


RTA00002709F.h.23.1.P.Seq


F 


M00006866A:D07 


CH02COH 


83 


21565 


RTA00002709F.e.11.1.P.Seq


F 


M00005773B:F09


CH02COH 


84 


373996 


RTA00002673F.n.11.1.P.Seq


F 


M00039108D:B06 


CH09LNL 


85 


380437 


RTA00002683F.c.09KP.Seq 


F 


M00040039D:D06 


CH09LNL 


86 


430729 


RTA00002669F.h.18.2.P.Seq


F 


M00033226A:A11


CH08LNH


87 


376791 


RTA00002674F.l.17.1.P.Seq


F 


M00039166B:G06


CH09LNL 


88 


373760 


RTA00002672F.p.20.1.P.Seq


F 


M00039049D:G07 


CH09LNL 


89 


373837


RTA00002672F.p.22.1.P.Seq


F 


M00039050A:H10


CH09LNL 


90 


376435 


RTA00002678F.fi. 17.2. P.Seq 


F 


M00039476B:A02 


CH09LNL 


91 


373881 


RTA00002672F.b.20.1.P.Seq


F 


M00038638D:H03 


CH09LNL 


92 


377086 


RTA00002676F.p.07.1.P.Seq


F 


M00039328D:D07 


CH09LNL


93 


377S89 


RTA00002672F.c.08.1.P.Seq


F 


M00038661A:A07


CH09LNL 


94 


380442 


RTA00002684F.b.05.2.P.Seq


F 


M00040111C:D05


CH09LNL



95 


374689 


RTA00002676F.m.13.2.P.Seq


F 


M00039318B:B09


CH09LNL 


96 


375339 


RTA00002678F.m.23.2.P.Seq


F 


M00039616A:B10


CH09LNL 


97


14197 


RTA00002710F.f.15.1.P.Seq


F 


M00022084D:B01


CH03MAH 


98 


380666 


RTA00002684F.c.04.2.P.Seq


F 


M00040115B:H12


CH09LNL 


99 


377352 


RTA00002677F.l.13.2.P.Seq


F


M00039404B:A05 


CH09LNL 


100 


379188 


RTA00002682F.a.03.1.P.Seq


F 


M00039914D:G12


CH09LNL 


101 


428269 


RTA00002666F.c.13.1.P.Seq


F 


M00032539B:C11


CH08LNH 


102 


373464 


RTA0000267I F.LI 3.3, P.Seq 


F 


M00038327A:C11


CH09LNL 


103 


15527 


RTA00002710F.p.07.1.P.Seq


F


M00022747D:E03 


CH03MAH 


104 


377504 


RTA0000267 1 Fx 1 7.3.P.Seq 


F 


M00038303C:D02


CH09LNL 


105 


33508 


RTA00002710F.g.17.1.P.Seq


F 


M00022183B:C02


CH03MAH 


106 


129179 


RTA00002662F.d.19.2.P.Seq


F 


M00007157C:F11


CH02COH 


107 


377086 


RTA00002676F.p.07.2.P.Seq


F 


M00039328D:D07 


CH09LNL 


108 


375872 


RTA00002675F.h.15.1.P.Seq


F 


M00039233A:A03 


CH09LNL 


109 


375652 


RTA00002676F.i.07.3.P.Seq


F 


M00039303C:F11


CH09LNL 


110


374266 


RTA00002674F.i.08.2.P.Seq 


F 


M00039144C:E06


CH09LNL 


111 


378983 


RTA00002682F.a.07.1.P.Seq


F 


M00039915D:C11


CH09LNL 


112 


377343 


RTA00002684F.g.04.1.P.Seq


F 


M00040302C:A04 


CH09LNL


113 


378679 


RTA0000268IFX l6,2.P.Scq 


F 


M00039869B:F06 


CH09LNL 


114


374095 


RTA00002671F.p.08.2.P.Seq


F 


M00038618C:C08


CH09LNL 


115 


375843 


RTA00002671F.o.06.2.P.Seq


F 


M00038614C:H11


CH09LNL


116 


377788 


RTA00002684F.h.01.2.P.Seq


F 


M00040305C:H06 


CH09LNL 


117 


21403 


RTA00002709F.j.05.1.P.Seq


F 


M0000692SD:D07 


CH02COH


118


23184


RTA00002709F.b.05.2.P.Seq


F 


M00005358B:B06


CH02COH 


119 


15671 


RTA00002710F.k.16.1.P.Seq


F 


M00022495D:H08 


CH03MAH 


120 


177367 


RTA00002663F.m.22.1.P.Seq


F 


M00022986D:H09 


CH03MAH 


121 


377788 


RTA00002684F.g.24.1.P.Seq


F 


M00040305C:H06


CH09LNL 


122 


375058 


RTA00002675F.h.02.1.P.Seq


F 


M00039250D:G12


CH09LNL 


123 


380412 


RTA00002680F.k.15.2.P.Seq


F 


M00039816B:D04


CH09LNL 


124 


178447 


RTA00002663F.n.06.1.P.Seq


F 


M00023007A:H04


CH03MAH 


125 


376647 


RTA00002674F.h.07.1.P.Seq


F 


M00039140D:D09 


CH09LNL 


126 


44679 


RTA00002661F.c.19.1.P.Seq


F 


M00003800A:F09 


CH01COH


127 


377659 


RTA00002678F.a.04.2.P.Seq


F 


M00039430B:F12


CH09LNL


128 


379703 


RTA00002682F.h.03.1.P.Seq


F 


M00039982C:H04


CH09LNL 


129 


374673 


RTA00002673F.e.03.2.P.Seq


F 


M00039068B:B04


CH09LNL


130 


205(3 


RTA00002710F.j.12.1.P.Seq


F 


M00022391D:F10 


CH03MAH 


131 


376124 


RTA00002682F.n.17.1.P.Seq


F 


M00040021A:F09


CH09LNL 


132 


374679 


RTA00002676F.d.07.2.P.Seq


F 


M000392$ID:B04 


CH09LNL 


133 


23184 


RTA00002709F.b.05.1.P.Seq


F 


M00005358B:B06


CH02COH


134 


430953 


RTA00002668F.i.23.1.P.Seq


F 


M00033007C:E01


CH08LNH 


135




RTA00002684F.b.05.1.P.Seq


F


M00040111C:D05


CH09LNL 


136 


12374 


RTA00002709F.a.01.1.P.Seq


F 


M00004825D:D05 


CH02COH 


137 


427466 


RTA00002665F.b.l I.S.P.Seq 


F 


M00028184D:G10


CH08LNH


138 


3661! 


RTA00002668FI\03.LP i Seq 


F 


M00032942D:C12


CH08LNH 


139 


33756 


RTA00002662F.a.18.2.P.Seq


F 


M00005359A:D04 


CH02COH 


140 


456026 


RTA00002694F.e.03.1.P.Seq


F 


M00043616C:A05


CH20COHLV 


141 


15766 


RTA00002710F.k.02.1.P.Seq


F 


M00022444D:G01


CH03MAH 


142 


24352 


RTA00002709F.a.05.1.P.Seq


F 


M00004839C:H02


CH02COH 


143 


24354 


RTA00002709F.a.03.1.P.Seq


F 


M00004832D:H02 


CH02COH 


144 


379114 


RTA00002681F.o.01.2.P.Seq


F 


M00039903C:F03


CH09LNL 


145 


19609 


RTA00002709F.c.05.1.P.Seq


F 


M00005457C:A03


CH02COH 


146 


21685 


RTA00002709F.e.23.1.P.Seq


F 


M00006581D:F08


CH02COH 


147 


380085 


RTA00002682F.i.10.1.P.Seq


F 


M00039987A:F09 


CH09LNL 


148 


20700 


RTA00002710F.L 1 8. 1 .P.Seq 


F 


M00022373A:B05 


CH03MAH 


149 


379981 


RTA00002682F.i.18.1.P.Seq


F 


M00039988A:E06 


CH09LNL 


150 


37659! 


RTA00002675F.c.01.1.P.Seq


F 


M00039213A:D01


CH09LNL 


151 


92058 


RTA00002663F.m.04.1.P.Seq


F 


M00022895A:H08


CH03MAH 


152 


196936 


RTA00002663F.m.02.1.P.Seq


F 


M00022885C:H05


CH03MAH 


153 


430702 


RTA00002668F.h.04.1.P.Seq


F 


M00032990B:A11


CH08LNH 


154 


378448 


RTA00002680F.n.21.2.P.Seq


F


M00039832A:B12


CH09LNL 


155 


41606 


RTA00002713F.e.10.1.P.Seq


F 


M00027301A:G05


CH04MAL 


156 


213817 


RTA00002664F.i.19.1.P.Seq


F 


M00027634A:DM 


CH04MAL 


157 


373464 


RTA00002671F.L 13.1. P.Seq 


F 


M00038327A:C11


CH09LNL 


158


379483 


RTA00002679F.k.12.1.P.Seq


F 


M00039700B:D02 


CH09LNL 


159 


375796 


RTA00002680F.f.17.1.P.Seq


F 


M00039795B:H10


CH09LNL 


160 


375796 


RTA00002680F.f.17.2.P.Seq


F 


M00039795B:H10 


CH09LNL 


161 


120485 


RTA00002663F.b.12.1.P.Seq


F 


M00021665B:F12 


CH03MAH 


162 


374291 


RTA00002673F.f.17.1.P.Seq


F 


M00039072C:E02


CH09LNL 


163 


380513 


RTA00002677F.p.15.2.P.Seq


F 


M00039423C:E01


CH09LNL 


164 


379416 


RTA00002683F.j.07.2.P.Seq


F 


M00040077D:C11


CH09LNL 


165 


378178 


RTA00002680F.l.13.1.P.Seq


F 


M00039820A:F11


CH09LNL 


166


427947 


RTA00002665F.n.24.1.P.Seq


F 


M00032495B:D02 


CH08LNH 


167 


427269 


RTA00002665F.d.03.3.P.Seq


F 


M00028212C:B08 


CH08LNH 


168 


20451 


RTA00002710F.j.10.1.P.Seq


F 


M00022391B:E01


CH03MAH 


169 


377003 


RTA00002683F.g.09.2.P.Seq


F 


M00040062B:B05


CH09LNL 


170 


427759 


RTA00002665F,o.lU. P.Seq 


F 


M00032499C:A01


CH08LNH 


171 


427549 


RTA00002668F.k.13.1.P.Seq


F 


M0003303^C:A06 


CH08LNH 


172 


373881 


RTA00002672F.b.20.2.P.Seq


F 


M00038638D:H03


CH09LNL 


173 


188215 


RTA00002664F.M3 2.PSeq 


F 


M00027200A:F02


CH04MAL 


174 


379683 


RTA00002681F.d.04.2.P.Seq


F 


M00039857B:G10


CH09LNL 


175 


380652 


RTA00002678F.d.12.2.P.Seq


F 


M00039455D:H04


CH09LNL 


176 


378334 


RTA00002679F.h.10.1.P.Seq


F 


M00039682C:H11


CH09LNL 


177 


377930 


RTA00002680F.g.14.1.P.Seq


F 


M00039798B:B02 


CH09LNL 


178 


378692 


RTA00002680F.o.20.3.P.Seq


F 


M00039835A:F07 


CH09LNL 


179 


32279 


RTA00002709F.d.23.1.P.Seq


F 


M00005673B:B12


CH02COH 


180 


376379 


RTA00002680F.C. If. 1 .P.Seq 


F 


M00039782A:H10


CH09LNL 


181 


375963 


RTA000F>675F.U2. LP.Seq 


F 


M00039238A:B12


CH09LNL 


182 


378683 


RTA00002680F.a.14.2.P.Seq




M00039773D:A09 


CH09LNL


183 


374946 


RTA00002673F.j.24.2.P.Seq




M00039096A:E07 


CH09LNL


184 


429583 


RTA00002666F.g.10.1.P.Seq




M000325S4A:H08 


CH08LNH 


185 


28338 


RTA000027IIF.e.r.l. P.Seq 




M00022930C:E02


CH03MAH 


186 


427970 


RTA00002665F.j.13.1.P.Seq




M0003136SA:E10 


CH08LNH


187 


379650 


.RTA00002683F.h.^.\P.Seq 




M000400#2C:G09 


CH09LNL 


188 


379661 


RTA00002676F.C.O? 2. P.Seq 




M00039277D:G10


CH09LNL


189


376182 


RTA00002677F.b.17.2.P.Seq


F 


M00039340B:E07


CH09LNL 


190 


374797 


RTA00002678F.b.12.2.P.Seq


F 


M00039444C:H02


CH09LNL 


191 


375389 


RTA00002674F.a.13.1.P.Seq


F 


M00039120C:C09


CH09LNL 


192 


397115 


RTA00002683F.i.22.2.P.Seq 


F 


M00040076C:D06


CH09LNL 


193 


186655 


RTA00002712FJ.2LLP.Seq 


F 


M00026941D:A04


CH04MAL 


194 


404682 


RTA00002687F.b.13.1.P.Seq


F 


M00039766D:H01 


CH14EDT 


195 


19609 


RTA00002709F.c.05.2.P.Seq


F 


M00005457CA03 


CH02COH


196 


404682 


RTA00002687F.b.13.2.P.Seq


F 


M00039766D:H01 


CH14EDT 


197 


380412 


RTA00002680F.k.15.1.P.Seq


F 


M00039816B:D04 


CH09LNL 


198 


394413 


RTA00002689F.d.17.3.P.Seq


F 


M00042742D:D05 


CH15CON 


199 


380086 


RTA00002679F.m.16.1.P.Seq


F 


M00039710C:G03 


CH09LNL 


200 


430738 


RTA00002669F.U5-2.PSeq 


F 


M00033231D:S09 


CH08LNH 


201 


40667 


RTA00002712F.g.22.1.P.Seq


F 


M00026882D:G09 


CH04MAL 


202 


397421 


RTA0000268 1 Fx. 1 6.2.P.Seq 


F 


M00039854B:F09


CH09LNL 


203 


398775 


RTA00002679F.f. I L LP.Seq 


F 


M00039675D:H05


CH09LNL 


204 


87345 


RTA000027I2FT. 19. LP.Seq 


F 


M00026850D:F09 


CH04MAL 


205 


379920 


RTA00002679F.b.24.2.P.Seq


F 


M00039660C:C10


CH09LNL 


206 


380666 


RTA00002684F.c.04.1.P.Seq


F 


M00040115B:H12


CH09LNL 


207 


404340 


RTA00002687F.b.05.2.P.Seq 


F 


M00039764C:D07


CH14EDT


208 


375509 


RTA00002680F.e.08.2.P.Seq


F 


M00039790B:D03 


CH09LNL 


209 


46423 


RTA000027 1 2F.I.02. LP.Seq 


F 


M00026914A:H10


CH04MAL 


210 


401713 


RTA00002685F.p.10.2.P.Seq


F 


M00039647A:H11


CH12EDT 


211 


377003 


RTA00002683F.g.09.1.P.Seq


F 


M00040062B:B05


CH09LNL 


212 


378891 


RTA00002672F.i.18.1.P.Seq


F 


M00039016A:A02


CH09LNL 


213 


412778 


RTA00002685F.L07.2.P.Seq 


F 


M00039533D:F04 


CH12EDT 


214 


373786 


RTA00002679F.a.20.2.P.Seq 


F 


M00039655C:C07 


CH09LNL 


215 


378692 


RTA00002680F.o.20.2.P.Seq 


F 


M00039835A:F07 


CH09LNL 


216 


88888 


RTA00002713F.f.22.1.P.Seq


F 


M00027355A:B07


CH04MAL 


217 


358187 


RTA00002676F.b.04.2.P.Seq 


F 


M00039273D:B02


CH09LNL 


218 


377131 


RTA00002682F.e.10.1.P.Seq


F 


M00039933C:E11


CH09LNL 


219 


21488 


RTA00002703F.f. I7J P.Seq 


F 


M00004152A:C12


CH01COH


220 


447487 


RTA00002689F.e.04.3.P.Seq


F 


M00042895A:D10 


CH15CON


221 


364 


RTA00002710F.a.06.1.P.Seq


F 


M00007929C:B08 


CH03MAH 


222 


404024 


RTA00002687F.e.18.2.P.Seq


F 


M00039958A:A08 


CH14EDT


223 


152305 


RTA00002712F.d.02.1.P.Seq


F 


M00023376B:G04 


CH04MAL 


224 


106050 


RTA00002713F.o.11.1.P.Seq


F 


M00027668C:H12


CH04MAL 


225 


41126 


RTA000027 1 3 F. U 2. 1 . P.Seq 


F 


M00027546C:B10


CH04MAL 


226 


113496 


RTA00002713F.n.20.1.P.Seq


F 


M00027625A:H01


CH04MAL 


227 


447487


RTA00002689F.e.04.1.P.Seq


F 


M00042895A:D10


CH15CON 


228 


146335 


RTA00002712F.j.17.1.P.Seq


F 


M00026980A:D09 


CH04MAL 


229 


376647 


RTA00002674F.h.07.2.P.Seq


F 


M00039140D:D09


CH09LNL 


230 


376746 


RTA00002674F.f.12.2.P.Seq


F 


M00039133B:F08


CH09LNL 


231 


373523 


RTA00002674F.n.21.2.P.Seq


F 


M00039177B:D03 


CH09LNL 


232 


455466 


RTA00002694F.c.10.1.P.Seq


F 


M00043461D:E06


CH20COHLV 


233 


374031 


RTA00002683F.p.17.2.P.Seq


F 


MOO04OIO5OFM 


CH09LNL 


234 


373997 


RTA00002673F.m.04.2.P.Seq


F 


MOOOJ9I05CB08 


CH09LNL


235 


455717 


RTA00002694F.a.06.1.P.Seq


F 


M00042593C:C06 


CH20COHLV 



236 


373837 


RTA00002672F.p.22.2.P.Seq


F 


M00039050A:H10


CH09LNL


237 


374513 


RTA00002672F.U62.RSeq 


F


M00039015B:H09 


CH09LNL 


238 


375628 


RTA00002672F.k.04.2.P.Seq


F 


M00059026D:F05 


CH09LNL


239 


377732 


RTA00002681F.p.09.1.P.Seq


F 


M00039910C:G10


CH09LNL 


240 


378326


RTA00002681F.m.11.1.P.Seq


F 


M00039896C:H01 


CH09LNL 


241 


378001 


RTA00002681F.m.22.1.P.Seq


F 


M00039898D:C06 


CH09LNL 


242 


378459 


RTA00002681F.i.07.2.P.Seq


F 


M00039879D:B11


CH09LNL 


243 


373862 


RTA00002671F.g.01.2.P.Seq




M00038284B:H04 


CH09LNL 


244 


373252 


RTA00002670F.k.16.1.P.Seq




M00033451A:H01


CH09LNL 


245 


378475 


RTA00002672F.g.24.1.P.Seq


F 


M00039006D:B01 


CH09LNL


246 


379941 


RTA00002682F.j.15.1.P.Seq


F 


M00039990C:D10 


CH09LNL 


247 


427703 


RTA0OOO2665F.eJIJ.RSeq 


F 


M00028357A:G10


CH08LNH


248 


373976 


RTA00002671F.p.15.2.P.Seq


F 


M00038619B:A03


CH09LNL 


249 


431643 


RTA00002669F.l.16.3.P.Seq


F 


M00033276D:H09 


CH08LNH 


250 


383502 


RTA00002670F.k.07.1.P.Seq


F 


M00033446D:B02 


CH09LNL


251 


378764 


RTA00002681F.j.04.1.P.Seq


F 


M00039884A:H11


CH09LNL 


252 


431629 


RTA00002669F.l.14.3.P.Seq


F 


M00033276B:G08 


CH08LNH 


253 


372992 


RTA00002671F.b.16.2.P.Seq


F 


M00033594C:B03 


CH09LNL 


254 


431601 


RTA00002669F.k.08.3.P.Seq


F 


M00033263B:G04 


CH08LNH


255 


21059 


RTA00002710F.c.05.1.P.Seq




M00008053A:F10


CH03MAH 


256 


430689 


RTA00002669F.i.24.3.P.Seq




M00033243B:A05


CH08LNH 


257 


131764 


RTA00002662F.c.14.1.P.Seq




M00006893C:E07 


CH02COH 


258 


373300 


RTA00002674F.c.21.2.P.Seq




M00039126D:A08 


CH09LNL 


259 


384601


RTA00002670F.k.06.1.P.Seq




M00033446C:G08 


CH09LNL


260 


375389 


RTA00002674F.a.13.2.P.Seq




M00039120C:C09


CH09LNL


261 


15248 


RTA00002710F.f.23.1.P.Seq




MO00::i27C;HO3 


CH03MAH 


262 


428134 


RTA00002666F.c.15.1.P.Seq


F


M00032540A:A09 


CH08LNH


263 


374184 


RTA00002672F.a.19.1.P.Seq




M00038633A:D07 


CH09LNL 


264 


136225 


RTA00002676F.n.02.2.P.Seq




M00039319C:A04


CH09LNL 


265 


401713 


RTA00002685F.p.10.1.P.Seq




M00039647A:H11


CH12EDT 


266 


27104 


RTA00002661F.a.09.1.P.Seq




M00001363D:D09


CH01COH 


267 


207466 


RTA00002664F.j.08.2.P.Seq




M00027733A:A02 


CH04MAL 


268 


143045 


RTA00002663F.a.02.1.P.Seq




M00007941D:C09 


CH03MAH 


269 


378830 


RTA00002675F.e.07.1.P.Seq




M00039221A:H03


CH09LNL


270 


21731 


RTA00002709F.k.07.1.P.Seq




M00007013A:D09


CH02COH 


271 


428552 


RTA00002666F.c.16.1.P.Seq




M00032541D:H08


CH08LNH


272 


187632 


RTA00002664F.i.15.2.P.Seq




M00027617B:C12


CH04MAL 


273 


431053 


RTA00002668F.o.05.2.P.Seq




M00033130B:F06


CH08LNH


274 


188972 


RTA00002664F.d.2(L LP.Seq 




M00027030C:H06


CH04MAL 


275 


430678 


RTA00002668F.hJ21.RSeq 




M00032994A:A08 


CH08LNH




l7,iA.n 
J /4U4.I 


RTA00002672F.a.08.1.P.Seq




M00038631C:B10


CH09LNL


277


24332 


RTA00002709F.j.07.1.P.Seq




M00006935C:F06


CH02COH 


278 


376764 


RTAO0OO2674FT.2OJ.RSeq 




M00039135D:F05


CH09LNL


279 


13433$ 


RTA00002662F.c.15.2.P.Seq




M00006S97A:H02 


CH02COH 


280 


375541 


RTA00002680F.d.21.2.P.Seq




M00039788A:E03 


CH09LNL


281 


228909 


RTA00002664F.e.08.2.P.Seq




M00027085C:E11 


CH04MAL 


282 


58063 


RTA00002661F.h.18.1.P.Seq




M00004234A:E07 


CH01COH



283 


380500 


RTA00002670F.p.19.2.P.Seq


F 


M0OO33'>83RFO6 




284 


34928 


RTA00002710F.p.21.1.P.Seq


F 


MOOO^ *> 7 9 * R • C.Oft 


PWAlNil A Li 


285 


374028 


RTA00002674F.k.03.2.P.Seq 


F 


I'lVUwJ r 1 rfUn^O 1 1 


PWAQI mi 


286 


374121 


RTA00002672F.h.22.2.P.Seq 


F 




CH09LNL


287 


429547 


RTA00002668F.c.07.1.P.Seq


F 




urtuoLiNri 


288 


380668 


RTA00002670F.p.11.2.P.Seq


F


M00O33^^irHlft 




289 


258704 


RTA00002665F.m.06.1.P.Seq


» 






290 


380325 


RTAOOO02670F.p.2"\2. P.Seq 


F 


moooi * * rv ro* 




291 


378326 


RTA00002681F.m.11.2.P.Seq


F 






292 


375618 


RTA00002675F.d.13.1.P.Seq


F 


M0001Q"* 1 ft A ■ Pft'5 


PHOOf MI 


293 


20999 


RTA00002709FJ. 16.1. P.Seq 


F 


MOOOO£ 0 7 » CftA 


fI4A"i|T\LI 


294 


29102 


RTA00002710F.p.18.1.P.Seq


F 




nuj ivi/\ri 


295 


379334 


RTA00002680F.b.22.1.P.Seq


F 


moooiq77rp* aaj 




296 


23943 


RTA00002709F.i.12.1.P.Seq


F 


iviuuuuoooOLJ. nU- 




297 


373998 


RTA00002672F.a.10.2.P.Seq


F 


moooi^a*; i rvRAi 




298 


373325 


RTA00002672F.c.14.2.P.Seq


F 


MUUUJoOO_D.A 1- 


runoi mi 


299 


373818 


RTA00002672F.e.15.2.P.Seq


F 


\A AAA i loo >\ r • nn* 




300 


429843 


RTA00002668F.c.10.1.P.Seq


F 


X/lAAA^Q 1 QB« CAA 


rUrtOI MLI 


301 


427755 


RTA00002665F.d.19.3.P.Seq


F 


K*inAA"»9"; 1 AR M 1 1 


rtiftsi MLI 


302 


189177 


RTA00002664F.c.23.2.P.Seq


F 


\A AA A T AQ'? *? C • I" A 1 


L.nU4i>iAL 


303 


13294 


RTA00002709F.j.15.1.P.Seq


r 


\ylAAAAAQA9 A Y^A« 




304 


178801 


RTA00002663 Fn.OU. P.Seq 


F 

r 


MAAA^^OQT A -Fn/; 

jviui/i/— v / .-v.ruo 




305 


230865 


RTA0000^664F d 03 "> P Sea 


c 

r 


V.I AAA*?AO"»9n- An". 


run t VI \ f 


306 


178801 


RTA00002663F.m.24.1.P.Seq


c 

r 


.\jfAAA' > " > 007 A >PA/i 




307 


378809 


RTAQ000267T g "> 1 "> P Sea 


F 
r 


MAAA10nA>r**UAI 




308 


378957 


RTA00002670F.d 1 7 1 P Sea 


F 

r 


i\zfnnn^n^"*r , rA^ 




309 


373523 


RTA00002674F.n.21.1.P.Seq


F 


N/fAAA 1 77R'HA^ 
IVlUUUJ"! / JQ.LAJJ 




310 


375458 


RTA0OOQ2673FJ.O6,2-P.3eq 


F 


maao >q^ 1 1 rvn 1 1 

IVll/UU J "Q I I LJ, LJ 1 1 




311 


429794 


RTA00002668Fx.09. 1 .P.Seq 


F 




PHORI MM 


312 


72797 


RTA00002661F.e.07.1.P.Seq


F 


iviuuuu-' -oik. . ru_ 


rno irnH 


313 


429992 


RTA00002668F.c.21.1.P.Seq


F 


N/1 AAA i^O^ 1 R-MOS 


runQi \iu 


314 


374410 


RTA00002674F.U l.2.P.Seq 


F 


M00039I >XR riP 

ITIUuuJ ' 1 - O I_> , VJ 1 _ 


THOQI Ml 


315 


376553 


RTA00002674F.g.19.1.P.Seq


F 


1VIWUJ7 1 J7.*\.V.l/7 


pwoor Ml 


316 


429096 


RTA00002666F.f.16.1.P.Seq


F 


l»lUUvJ-r ■ Ot\.\J\J\f 


fUflQ! MU 


317 


181948 


RTA00002663F.n.05.1.P.Seq


F 


mooo^ "oo no7 


THO'i.VIAH 

^— i IVJ IVI^-\I 1 


313 


378475 


RTA00002672F.h.01.2.P.Seq


F 


M000 * 90060- R0I 


THOQf Nil 


319 


427336 


RTA00002665F.c.23.1.P.Seq


F 


MOOO^S^ lOB DO - * 

l'IUUV/-U . 1 UU. L/vf — 


CH08LNH


320 


374042 


RTA00002672F.a.08.2.P.Seq 


F 


MOOO^So^ IC BiO 


CH09LNL 


321 


386543 


RTA00002672FX 1 3.2.P.Seq 


F 


M000^8°99B Gl 1 


CH09LNL 


322 


376659 


RTA00002678F.il. 1 1 .2,P.Seq 


F 


M000J94T5C:E10 


CH09LNL 


323 


29135 


RTA00002663F.C.09. 1. P.Seq 


F 


M0002lo:3C:Dll 


CH03MAH 


324


377967 


RTA00002681F.m.17.2.P.Seq


F 


M00039SO7D:CI0 


CH09LNL 


325 


431330 


RTA00002668F.m.16.2.P.Seq


F 


MO0033O^4A:C08 


CH08LNH 


326 


373824 


RTA00002680F.i.19.2.P.Seq


F 


M00039S08D:H02 


CH09LNL 


327 


50094 


RTA0000266IF.j.O:2.P,Seq 


F 


M00004378A:B10 


CH01COH 


328 


214272 


RTA00002664F.h.03.2.P.Seq


F 


M00027366A:F11


CH04MAL 


329 


377293 


RTA00002680F.bJ72.PSeq 


F 


MO00:^^"C:E05 


CH09LNL 





WO 01/02568 



PCT/US00/18374 



330 


195053 


RTA00002663F.il, 16, LP.Scq 




M00023044B:D02 


CH03MAH 


331 


21274 


RTA00002709F.m.09.1.P.Seq


F 


M00007194A:B09 


CH02COH 


332 


376580 


RTA00002675F.b.20.1.P.Seq




M00039212G:C12 


CH09LNL


333 


374725 


RTA00002673F.f.02.2.P.Seq


F 


M00039070D:C02 


CH09LNL 


334 


25238 


RTA00002710F.n.08.1.P.Seq


F


M00022634D:C08


CH03MAH 


335 


377337 


RTA00002683F.l.07.2.P.Seq


F


M00040085D:A10 


CH09LNL 


336 


450485 


RTA00002692F.a.13.2.P.Seq


F


M00042625C:B04 


CH18CON 


337 


21989 


RTA00002709F.h.22.1.P.Seq




M00006861B:F09


CH02COH 


338 


58296 


RTA00002661F.i.20.2.P.Seq


F 


M00004354D:E05


CH01COH


339 


379144 


RTA00002679F.l.14.1.P.Seq


F 


M0003970^D*FO' ? 


CH09LNL 


340 


379690 


RTA00002680F.b.21.2.P.Seq


F 


M00039778B:G03


CH09LNL 


341 


379640 


RTA00002681F.d.12.2.P.Seq




M000398^9C G10 


CH09LNL 


342 


373988 


RTA00002673F.h.23.1.P.Seq


F 


M00039079A:A05


CH09LNL 


343 


373988 


RTAOOO0' > 673F h 23 ^ P Sea 


T 


M00039079A:A05


CH09LNL 


344 


380673 


RTAOOO0^673F i 13 7 PSea 


P 


M00039084C:H03


CH09LNL 


345 


55243 


RT AOO00^66 1 F i 06 7 P Sea 





1^00004^8^0^ 1 1 


CH01COH


346 


40557 


RTA0000^7 1 3 F h 1 I P Sea 


" ~F 


MO00* ,l 7398C F07 

|T|WV« ' J ' *J - 4 V ' 


CH04MAL 

X* 1 1 V~ 1 V 1 i 1 1— * 


347 




RTA0000">677F m 03 ! P Sea 


— " — F 


iVl000*94l 7 A DO 3 


CH09LNL 

X» I Iv/Li'L 


348 




RTAOOO0' > 679F i 0^ 1 P Sea 


" ~F 


M00010689CE08 


CH09LNL 


349 


43039? i 


RTA0000" , 668F k 19 1 PSea 




M000- >0*7D'C1 1 


CH08LNH 


350 


376746 


RTA0000^674F f 1° 1 P Sea 


p 


M00039133B:F08


CH09LNL


351 


1 1 J J 7 J 


RTAOOO0' > 7 1 3F e 07 1 P Sea 

IX 1 r»WW™- ' f J 1 .v» »i# » - ■ - 1 - JVM 


p 


M000"*7' 7 9" A C04 


CH04MAL


35^ 


377 1 S" 7 

Jill o— 


RTA0000^68^F IN 1 P Sea 


F 


MOOlUOU 1 0 A - F 1 0 


CH09LNL 


353 


380659 


RTA0000^6S4F e 0 7 ^ P Sea 


p ™ 


M00040I "'-ID HOI 

l"IVvW , "'J 1 — ~ 1— i* i 4 I V l 


CH09LNL 


354 


J 1 JOU- 


RTA0000' , 67 1 F 2 0 1 IP Sea 


F 


M000"?8^84B H04 


CH09LNL 

X*- 1 IV 7 4 ' iw 


355 


376096 


RTA0000 7 677F b 16 ~* P Sea 


P 


M000-9340A D0^ 


CH09LNL 

X— ■ I w ✓ *vj I ' ft— < 


356 


37^887 


RTA0OO0°67OF d 05 7 P Sea 


— F 


M000333^8A H12 


CH09LNL 


357 


378475 


RTA0000267*>F a ^4 7 P Sea 





M000 ^90060' B0 1 


CH09LNL 


358 


427336 


RTA00002665F.c.23.3.P.Seq


F 


M0002S210B:D02 


CH08LNH 


359 


373814 


RTA0000 1 67 -> F b 07,2, P Sea 


P 


M0003S635A.G09 


CH09LNL 


360 


215506 


RTA00002664F.h.08.2.P.Seq




M00027438C:G07 


CH04MAL 


361 


374465 


RTA00002673F.c.07.2.P.Seq


F


M00039058C:H02 


CH09LNL 


362 


428784 


RTA00002667F.c.18.1.P.Seq




M00032744B:F10


CH08LNH 


363 


379581 


RTA00002676F.a.21.2.P.Seq


F 


M00039273B:F02


CH09LNL 


364 


378371 


RTA00002678F.f.20.2.P.Seq


' - p 


M00039465A:A08


CH09LNL 


365 


375154 


RTA00002676F.c.13.2.P.Seq


F


M00039:79B:H02 


CH09LNL 


366 


431214 


RTA00002669F.k.04.1.P.Seq




M00033262D:A11


CH08LNH 


367 


376053 


RTA00002675F.I.03. 1 P.Seq 


.... 


M00039249A:C12


CH09LNL 


368 


373282 


RTA00002680F.j.19.2.P.Seq




M00039813B:D11


CH09LNL 


369 


33397 


RTA00002661F.h.04.1.P.Seq


F 


M00004I6SA:G1I 


CH01COH


370 


376706 


RTA00002675F.c.02.1.P.Seq


F


M00039213B:F05


CH09LNL


371 


378292 


RTA00002681F.i.09.2.P.Seq




M00039880A:H11 


CH09LNL 


372 


431612 


RTA00002669F.e.23.3.P.Seq




M00033202D:G06 


CH08LNH 


373 


378471 


RTA00002679F.o.17.1.P.Seq




M0003972^C;B09 


CH09LNL 


374 


378666 


RTA0000I6SIF.L05.2. P.Seq 




M000:^8;9C;F05 


CH09LNL 


375 


374894 


RTA00002675F.f.04.1.P.Seq




M00039224A:E12 


CH09LNL 


376 


430191 


RTA00002667F.j.24.1.P.Seq




M00032829B:E06


CH08LNH



377


428581 




F 


MO0032739A:A06 


CH08LNH 


379598


RTA00002679F.k.03.1.P.Seq


F 


M000396973 F 1 1 


CH09LNL 






RTA00002710F.j.23.1.P.Seq


F 


M00022434D:D06 


CH03MAH 






RTA00002709F.b.10.1.P.Seq




M00005384A:CM 


CH02COH 


381


379928 


RTA00002679F.o.06.1.P.Seq


F 


M00039720D:D02 


CH09LNL 


382


430191 


RTA00002667F.k.01.1.P.Seq


F 


M00032829B:E06 


CH08LNH 


383


^74684 


RTA00002675F.g.02.1.P.Seq


F 


M00039223A:B05 


CH09LNL 


384


^7^728 


RTA00002676F.h.05.2.P.Seq


F 


M00039299B:G12


CH09LNL 




1 230237 


RTA00002670F.b.08.2.P.Seq


F 


M00033306D:H09 


CH09LNL 


386


380673 


RTA00002673F.j.13.1.P.Seq


F 


M00039084C:H03 


CH09LNL 


387


378938 


RTA00002679F.k.20.1.P.Seq


F 


M00039702A:B12 


CH09LNL


388


375115


RTA00002673F.e.01.1.P.Seq


F 


M00039066D:G08 


CH09LNL 


389 


378673 


RTA00002680F.p.21.2.P.Seq


F 


M00039838A:F05


CH09LNL 


390 


372909 


RTA00002670F.a.12.2.P.Seq


F 


M00033300D:H12


CH09LNL 


391


373300


RTA00002674F.c.21.1.P.Seq


F 


M00039126D:A08


CH09LNL 


392 


379318 


RTA00002683F.h.16.2.P.Seq


F 


M00040071B:A10


CH09LNL 


393 


378319 


RTA00002681F.k.07.2.P.Seq


F 


M00039890A:H05


CH09LNL 


394 


374608 


RTA00002675F.g.20.1.P.Seq


F 


M00039230A:A10


CH09LNL 


395 


374328 


RTA00002673F.c.24.2.P.Seq 


F 


M00039061B:F08 


CH09LNL 


396


374328 


RTA00002673F.d.01.2.P.Seq


F 


M00039061B:F08


CH09LNL


397


428401 


RTA00002667F.b.07.1.P.Seq


F 


M00032725C:F06


CH08LNH


398


136202 


RTA00002687F.p.05.2.P.Seq


F 


M00040349D:B09


CH14EDT 


399 


374394 


RTA00002673F.c.15.1.P.Seq


F 


MOO039059C;G0S 


CH09LNL 


400 


37784 


RTA0OOO2708F.Q. 1 7. 1 .P.Seq 


F 


M00003816D:E11


CH01COH


401 


378282 


RTA0000268 1 F.h. Ill .P.Seq 


F 


M00039876D:H09 


CH09LNL 


402 


185663 


RTA00002712F.p.17.2.P.Seq


F 


M00027178B:G09


CH04MAL 


403 


14866 


RTA00002709F.d.14.1.P.Seq


F 


M00005623D:G12


CH02COH 


404 


383502 


RTA00002670F.k.07.2.P.Seq


F 


M00033446D:B02 


CH09LNL 


405


13463 


RTA00002709F.f.18.1.P.Seq


F 


M00006657C:G05 


CH02COH 


406 


21274 


RTA00002709F.m.09.2.P.Seq


F 


M00007194A:B09


CH02COH 


407 


13745


RTA00002714F.b.13.1.P.Seq


F 


MOO027S01 CC11 


CH04MAL 


408 


23485 


RTA00002714F.c.10.1.P.Seq


F 


M00027836D:F12 


CH04MAL 


409 


428364 


RTA00002667F.c.09.1.P.Seq


F 


M00032737B:E09 


CH08LNH 


410 


431629 


RTA00002669F.l.14.2.P.Seq


F 


M00033276B:G08 


CH08LNH 


411


j797}4 


RTA00002682F.h.08.1.P.Seq


F 


M00039983D:A06 


CH09LNL 


412


431601


RTA00002669F.k.08.2.P.Seq


F 


M00033263B:G04


CH08LNH 




3 0749 


RTA0000^6SOF.f.2_>.2.P.Seq 


F 


M0O0:^795D:G06 


CH09LNL 


414 


378764


RTA00002681F.j.04.2.P.Seq


F 


M00039884A:H11


CH09LNL 




i 1 3603 


RTA00002664F.i.20.1.P.Seq


F 


M0002^647C:D03 


CH04MAL 


416


J / o 1 44 


RTAOOOO-6 oF.j.09. LP.Seq 


F 


M0003^241A:E1 1 


CH09LNL 


417 


373071 


RTA0000^670F i ^ P Sea 




{VIUUU j _w4_ A . UU6 


CH09LNL


418 


379684 


RTA00002681F.c.09.2.P.Seq




M0O0:^S5lB:Gl 1 


CH09LNL 


419 


379610 


RTA000OZ6$0F.k.l I.2.P.Seq 




M0005*3I5C:F09 


CH09LNL 


420 


22392 


RTAO00O::08F.a.lO.I. P.Seq 




MOO00 1 395 D:K02 


CH01COH 


421 


377555 


RTA00002683F.l.08.2.P.Seq




M000^0085D:E04 


CH09LNL 


422 


32624 


RTA000027 13 FT. 15.1. P.Seq 




M00O:^347C:GO^ 


CH04MAL 


423 


375024 


RTA0000:o"'?F.p.l2.I.p 1 Seq 




M000:^266D*r 12 


CH09LNL



424 


374725 


RTA00002673F.f.02.1.P.Seq




M00039070D:C02


CH09LNL 


425 


376228 


RTA00002676F.f.19.2.P.Seq


F 


M00039293A:H04 


CH09LNL 


426 


375906 


RTA00002675F.U 8. LP.Seq 


P 


M00039238D:A08


CH09LNL 


427 


186190 


RTA00002714F.a.04.1.P.Seq


p 


M00027729D:H06


CH04MAL 


428 


57694 


RTA000027 1 3 F.tl02. 1 .P.Seq 


p 


M00027319D:BM 


CH04MAL 


429 


7007 


RTA00002709F.d.08.1.P.Seq


p 


M00005614B:B01


CH02COH


430 


400084 


RTA00002685F.O.I9 Z.P.Seq 


p 


M00059641C:D07 


CH12EDT 


431 


375648 


RTA00002676F.il* 18.2.P.Seq 


F 


M00039301B:F06


CH09LNL


432 


166493 


RTA00002663F.h.08.1.P.Seq




M00022492C:A02


CH03MAH


433 


379632 


RTA00002682F.h.14.1.P.Seq


p 


M00039984B:G12


CH09LNL 


434 


373234 


RTA00002676F.g.15.2.P.Seq


F 


M00039297C:H08 


CH09LNL 


435 


401230 


RTA00002685F.i.05.2.P.Seq


" "p " 


M00039533A:C12


CH12EDT


436 


186623 


RTA000027I2RF. 15. LP.Seq 


F


M00026843B:D10


CH04MAL


437 


127714 


RTA00002712F.k.14.1.P.Seq


' P ™ " 


M00027013A:C09


CH04MAL 


438 


451857 


RTA00002692 F.jlO 1 . 1 .P.Seq 


J 


M00042584B:C10


CH15CON


439 


404620 


RTA00002687F.c.03.2.P.Seq 




M00039770A:G11


CH14EDT


440 


186872 


RTA00002663F.k.23.1.P.Seq


F 


M00022797B:G08


CH03MAH 


441 


42729 


RTA00002709F.c.06.2.P.Seq


F 


M0000 i >458A Fl 1 


CH02COH


442 


373380 


RTA00002674F.b.07.1.P.Seq


F 


MOOO^QPjABIO 


CH09LNL


443 


374465 


RTA00002673F.c.07.1.P.Seq


F 


M000-9058C HO** 


CH09LNL 


444 


403557 


RTA00002687F.d.10.2.P.Seq


F


M000*994SA E03 


CH14EDT


445 


16749 


RTA00002709F.b.14.2.P.Seq




M000054023:F08 


CH02COH


446 


375592 


RTA00002680F.f.22.2.P.Seq 


F 


MOO0 : 979SD EI0 


CH09LNL


447 


376103 


RTA00002676F.g.06.:.P.Seq 




M000»9* , 953 D03 


CH09LNL 


448


40228 


RTA00002712F.l.18.1.P.Seq


F 


M00027049B:F05 


CH04MAL 


449 


374606 


RTA00002673F.j.23.1.P.Seq


P 


M000-W6A A05 


CH09LNL 


450 


378270 


RTA000026S0F.h.08.:,P,Seq 




M000J^80I A:H1 1 


CH09LNL 


451 


236321 


RTA0000266SF.k. 14. LP.Seq 


p 


M000.U034CF02 


CH08LNH 


452 


378676 


RTA00002680F.m.20.2.P.Seq


p "" 


M000_^827B:F07 


CH09LNL 


453 


373252 


RTA00G02670F.U6.2.P.Seq 


F 


M00033451A:H01


CH09LNL 


454 


384601 


RTA00002670F.k.06.2.P.Seq 


p 


M00033446GG08 


CH09LNL 


455 


403772 


RTA00002687F.3.0J.2.P.Seq 


p 


M000J9746GG09 


CH14EDT


456 


379566 


RTA00002683F.k.04.1.P.Seq


p 


M00040081C:E01 


CH09LNL


457 


136202 


RTA00002687F.p.05.1.P.Seq


p 


M00040349D:B09


CH14EDT 


458 


14317 


RTA00002713F.c.13.1.P.Seq


F 


M00027248A:C02 


CH04MAL 


459 


375349 


RTA00002672F.j.11.1.P.Seq




M000.^O:4B;Bl0 


CH09LNL 


460 


403020 


RTA0OOO2637F.a.0:.2.P,Seq 


F 


M000.^r46C:A08 


CH14EDT


461 


374060


RTA00002672F.L07. 1 P.Seq 


p 


M000J^014B:C04 


CH09LNL 


462 


183399 


RTA00002712F.o.10.1.P.Seq


p 


M00027136C:C09


CH04MAL 


463 


373789 


RTA00002671F.c.20.1.P.Seq


F 


M000-3S:59B:G08 


CH09LNL 


464 


20168 


RTA00002711F.b.22.1.P.Seq


p 


mooo::sj4B:Gi i 


CH03MAH


465 


452641 


RTA00002692F.d.05.:.P.Scq 




M00043003C:D08 


CH15CON


466 


431370 


RTA0000266 c )F.m.04.:, P.Seq 




M00033288B:D12


CH08LNH 


467 


153044 


RTA00002713F.j.03.1.P.Seq




M0O0r4"6A:CO9 


CH04MAL 


468 


373229 


RTA00002679F.C. !6.:.P.Seq 




M000.'%63C:G09 


CH09LNL 


469 


374328 


BTA00002673F.d.O M .P.Seq 




M00039061B:F08


CH09LNL 


470 


39606 


RTA000027l3F.i.2Q.i. P.Seq 




M000r46SA:C09 


CH04MAL 






SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


471 


59077 


RTA00002713F.n.01.1.P.Seq


F 


M00027596C:E06


CH04MAL 


472 


1935 


RTA000Q27l0F.b. It. LP.Seq 


F 


M00008006B:B03 


CH03MAH 


473 


379684 


RTA00002681F.c.09.1.P.Seq


F 


M00039851B:G11


CH09LNL 


474 


451564 


RTA0000269|Rf.l2.2.P.Seq 


F 


M000434MD:H06 


CH17COHLV 


475 


7571 


RTA00002710F.a.15.1.P.Seq


F 


M00007943D:C09 


CH03MAH 


476 


129323 


RTA00002713F.k.21.1.P.Seq


F 


M00027525B:D06 


CH04MAL 


477 


12960 


RTA00002710F.a.23.1.P.Seq


F 


M00007976A:C10 


CH03MAH 


478 


186730 


RTA00002713F.o.05.1.P.Seq


F 


M00027641C:A03


CH04MAL 


479 


59077 


RTA00002713F.m.24.1.P.Seq


F 


M00027596C:E06


CH04MAL 


480 


185884 


RTA00002712F.b.06.1.P.Seq


F 


M000233f6C:G08 


CH04MAL 


481 


19471 


RTA00002708F.g.08.1.P.Seq


F 


M00004197B:H10 


CH01COH 


482 


45206 


RTA00002710F.c.06.1.P.Seq


F 


M00008063B:A06


CH03MAH 


483 


404257 


RTA00002687F.g.06.2.P.Seq 


F 


M00040208A:C03


CH14EDT 


484 


372997 


RTA00002679F.p.04, \ ,PSeq 


F 


M00039729A:A10


CH09LNL 


485 


43792 


RTA00002713F.k.16.1.P.Seq


F 


M00027520A:C05 


CH04MAL 


486 


400052 


RTA00002687F.h.13.2.P.Seq


F 


M00040291D:C05


CH14EDT 


487 


452194 


RTA00002692F.c.14.2.P.Seq


F


M00042988A:F06 


CH15CON


488 


24034 


RTA00002710F.b.06.1.P.Seq


F 


M00007992C:F06


CH03MAH 


489 


447544 


RTA00002689F.e.18.1.P.Seq


F 


M00042905D:D02 


CH15CON 


490 


401872 


RTA00002686F.c.23.1.P.Seq


F 


M00040I4!0:F05 


CH13EDT 


491 


376553 


RTA00002674F.g.19.2.P.Seq


F 


M00039139A:C09


CH09LNL 


492 


455051 


RTA00002694F.a.07.1.P.Seq


F 


M00042595A:A11


CH20COHLV 


493 


16760 


RTA00002708F.j.03.1.P.Seq


F 


M00004393B:E07


CH01COH 


494 


374174 


RTA00002672F.i.12.2.P.Seq


F 


M00039015A:D07 


CH09LNL 


495 


374283 


RTA00002672F.k.21.2.P.Seq


F 


M00039030B:E02 


CH09LNL 


496 


375772 


RTA00002681F.o.24.1.P.Seq


F 


M00039909C:G05


CH09LNL 


497 


376417 


RTA00002678F.i.03.2.P.Seq 


F 


M00039477D:A10


CH09LNL 


498 


423971 


RTA00002666F.o.02.1.P.Seq


F 


M00032678C:D06


CH08LNH 


499 


394098 


RTA00002681F.j.15.1.P.Seq


F 


M00039887C:E07


CH09LNL 


500 


379761


RTA00002670F.n.03.1.P.Seq


F 


M00033561C:A02 


CH09LNL 


501 


374266 


RTA00002674F.L08. 1 .P.Seq 


F 


M00039144C:E06


CH09LNL 


502 


372946 


RTA00002670F.l.07.1.P.Seq


F 


M00033457D:A05


CH09LNL 


503 


228909 


RTA00002664F.e.08.1.P.Seq


F 


M00027085C:E11


CH04MAL 


504 


427524 


RTA00002665F.e.05.1.P.Seq


F 


M00028354D:A03 


CH08LNH 


505 


380413 


RTA00002680F.k.19.2.P.Seq


F 


M00039816C:D05


CH09LNL 


506 


373866 


RTA00002671F.c.24.2.P.Seq


F 


M00038259C:H09 


CH09LNL 


507 


427202 


RTA00002665Fg.Lv LP.Seq 


F 


M00028617C:A12


CH08LNH 


508 


373000 


RTA00002670F.j.13.1.P.Seq


F 


M00033437C:C03


CH09LNL 


509 


378838 


RTA00002678F.p.lL2.P,Seq 


F 


MQQ0396J7C:AI0 


CH09LNL 


510 


24945 


RTA00002710F.p.05.1.P.Seq


F 


M00022739A:B03 


CH03MAH 


511 


20277 


RTA00002710F.e.17.1.P.Seq


F 


M000Z1972D:CU 


CH03MAH 


512 


20820 


RTA00002710F.e.02.1.P.Seq


F 


M0002I9IQC:A10 


CH03MAH 


513 


376791 


RTA00002674F.l.17.2.P.Seq


F 


M00059166B:G06 


CH09LNL 


514 


9809 


RTA00002710F.g.12.1.P.Seq


F 


M00022I7SB.D06 


CH03MAH 


515 


429562 


RTA00002667F.m.03.1.P.Seq


F 


M00032853D:G12 


CH08LNH 


516 


12920 


RTA00002710F.fi. 15. LP.Seq 


F 


M00021964C:E10


CH03MAH 


517 


377565 


RTA00002684F.h.19.1.P.Seq


F 


M000-U)509A:EU 


CH09LNL 






SEQ 
ID 



CLUSTER 



SEQ NAME 



ORIENTATION



CLONE ID 



LIBRARY 



429356 
427634 



RTA00002668F.d.23.1.P.Seq



M00032933A:C10 



CH08LNH 



519 
520 



521 



522 



523 



42771: 



RTA00002665F.f.09.1.P.Seq



373607 



RTA00002665F.e.23.1.P.Seq



378781 



RTA00002674F.d.15.2.P.Seq



RTA00002674F.o.14.1.P.Seq



429361 
126754 



RTA00002666F.d.11.1.P.Seq



M00028369D:E08 



CH08LNH 



M00028364C:G08 



CH08LNH 



M00039127D:E10



CH09LNL 



M00039196B:H06



CH09LNL 



M00032550D:C02 



CH08LNH 



524 



525 
526 



527 
528 



529 



530 



RTA00002663F.a.16.1.P.Seq



428047 



18863 



RTA00002665F.k.10.1.P.Seq



379761 



RTA00002709F.d.15.1.P.Seq



46407 



RTA00002670F.n.03.2.P.Seq



RTA00002665F.c.10.1.P.Seq



21365 



427466 



RTA00002709F.k.06.1.P.Seq



RTA00002665F.b.11.1.P.Seq



M00008045A:H02 



CH03MAH 



M00031417C:G09 



CH08LNH 



M00005625A:C02 



CH02COH 



M00033561C:A02



CH09LNL 



M00028196D:A03



CH08LNH 



M000070l2O:H08 



CH02COH 



M00028184D:G10



CH08LNH 



531 



400265 



380056 



RTA00002685F.c.03.2.P.Seq



RTA00002680F.a.16.2.P.Seq



M00039374B:B07



CH12EDT 



M00039773D:F11



CH09LNL 



533 
534 



535 



536 



537 



375324 



RTA00002678F.U2.2.P t Seq 



25165 



RTA00002710F.k.17.1.P.Seq



401296 



RTAQ0002635F.h.23.2.P.Seq| 



394098 



RTA00002681F.j.15.2.P.Seq



17430 



RTAOQ002710F.i.l UP.Seq 



M000396I23:B10 



CH09LNL 



M000224963:EC 



CH03MAH 



M00039529C:D07 



CH12EDT 



M00039887C:E07 



CH09LNL 



M00022365D:A03 



CH03MAH



538 



539 



540 
541 



542 



543 



544 
545 



546 



547 



548 
549 



373820 



RTA00002674F.d.Q6.i.P,Scq [ 



378548 



RTA00002672F.g.14.2.P.Seq



222679 



RTA00002664F.f.18.2.P.Seq



576874 



RTA00002670F.e.23.2.P.Seq



21329 



RTA00002709F.b.08.1.P.Seq



119905 



RTA00002710F.p.13.1.P.Seq



377028 



373351 



RTA00002678F.n.21.2.P.Seq



RTA0O002671F.1. 18.3 P.Seq 



376082 



RTA00002674F.m.17.1.P.Seq



376987 



61921 



RTA00002678F.g.21.2.P.Seq



RTA00002661F.g.08.1.P.Seq



373486 



RTA00002672F.b.03.2.P.Seq



M00039127A:G11 



CH09LNL 



M000390043:CM 



CH09LNL 



MOO0:722SD:A0l 



CH04MAL 



MOOO33375a:G04 



CH09LNL 



M00005379A:E04 



CH02COH 



M000:27S5C:G06 



CH03MAH 



M00039631A:C10



CH09LNL 



M0005S327D:A05 



CH09LNL 



M00039I713:DII 



CH09LNL 



M00039472C:B08



CH09LNL 



M000039953:E03 



CH01COH 



M00038635B:C08 



CH09LNL 



550 



380355 



RTA00002670F.o.06.2.P.Seq



M00033570C:C10 



CH09LNL 



551 



552 



553 



554 



555 
556 



557 



430295 



RTA00002667F.h.14.1.P.Seq



379221 



RTA00002682F.n.01.1.P.Seq



373532 



RTA00002672F.d.10.2.P.Seq



375633 



RTA00002677F.m.05.2.P.Seq



378356 



376196 



RTA0000268!F.r'.07.1. P.Seq 



RTA00002674F.m.12.1.P.Seq



375115 



RTA00002673F.d.24.2.P.Seq



M00032808B:G10 



CH08LNH



M00040017D:G03



CH09LNL 



M00038991A:D01



CH09LNL 



M00039417B:F01



CH09LNL 



M00039S66B;A0S 



CH09LNL 



M00039I70OF05 



CH09LNL 



M00039066D:G08



CH09LNL 



558 
559 



560 



561 



562 



375115 



RTA00002673F.e.01.2.P.Seq



378600 



RTA00002679F.i.03.1.P.Seq



375351 



RTA00002680F.e.15.1.P.Seq



25237 



RTA00002710F.n.23.1.P.Seq



193503 



RTA00002663F.n.15.1.P.Seq



M00039066D:G08 



CH09LNL 



M00039686OE06 



CH09LNL 



M00039792A.304 



CH09LNL 



M00022671B:A08



CH03MAH 



M00023039D:305 



CH03MAH 



563 
564 



428268 



RTA00002667F.b.01.1.P.Seq



379440 



RTA0Q00:683Fj.2L2.P.Seq 



M00032724A:C05



CH08LNH 



M00040080C:C06



CH09LNL 






SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION




LIBRARY 


565 


374302 


RTA0000 7 673F i 08 7 P Sen 




mW0j908(JC .H0o 


CH09LNL 


566 


240615 


RTA00002672F.e.19.2.P.Seq


F 


\A(\f\/\ i ffoo h rv - e a-; 


CH09LNL 


567 


379207 


RTA00002670F.b.07.2.P.Seq


F 


JWUUU j J j UO U . Cj 05 


CH09LNL 


568 


427893 


RTA00002665F.k.19.1.P.Seq






CH08LNH 


569 


377530 


RTA0000* > 684F * 19 1 P Sea 


p 


MU0040jODA.U 1 1 


CH09LNL 


570 


429707 


RTA0000^663F c. 1 1 IP Sea 





iViUUUjZV 1 oL.b 10 


CH08LNH


571 


427610 


RTA0000" , 665F i 04 1 P Sea 


F 


MUUUz 0 / / U A , U04 


CH08LNH 


572 


100699 


RTA00002662F.b.22.2.P.Seq


F




ft frtT/~AI f 

CH02COH


573 


378974 


RTA 00007687 F m "> II P Sen 


r 


M00040017A:C06


CH09LNL 


574 


373607 


RTAOO0O" ) 674F d 15 1 P Sen 


~ 


\A A A A "* A i ITrv. r i A 

MUO0 j v 1 2 / V. :t 10 


CH09LNL


575 


26^9i 1 


RTAflO0O"*66^F d 04 i P Sen 




kXAAA*>o^ i cn.rrt^ 

M0002 82 1 3 D: FOj 


CH08LNH 


576 


30748 


RTAOO0O -) 71 3F e 1 I 1 P 'sm 


F 


M00027j01B.B08 


CH04MAL 


577 


161116


RTA00flfP7l4F ell 1 P v> n 


F 


M000278j /C:D09 


CH04MAL


578 






c 


M00040029A:G04


CH09LNL 


579 


430689 


RTA0000 -, 669F i 74 | p Sen 


P 


ivjIAAa^ ■*> ,i ■"■ r> . *a* 

M000j^24j B.AO^ 


CH08LNH 


580 


J f *♦ I *L 


RTA 00007*7": F 1 77 1 P 




L« — _ 


M000j9I04D:C09 


CH09LNL 


581 


376521


RTA00007677F h 06 "» P Sen 




iVI000j9^ 9 8 A : B 1 0 


CH09LNL 


582 


3778 ,4 




— — 

— — ^ 


M000jj^08B:G0d 


CH09LNL 


583 


379014 


RTA00007AJPF o 07 | BC^n 




M0004002 2 G : D06 


CH09LNL 


584 


376344 


RTA00007A77F b 1 R 1 P Sen 


5 


M000j9j40B:G08 


CH09LNL 


585 




RT A0fl0fl7A7 n i: fni T P 


= 


M00039288C:B11


CH09LNL 


586 


7 \£\f\\ 
z. i 1 


rv | a>vWVJ— / i/~r .c.io.i ,r Oct, 


F 

_ L — 


M00005820C:E04


CH02COH 


587 




RTAOOnfPAT^F h \ R 1 P •vWi 


1 


M000j921 I A:CI2 


CH09LNL 


588 




RTAOOOO^fSrtQF h M \ P 




M000jj22jB:HO7 


CH08LNH


589 


l6"PQi 

1 UJ.7J 


RTA00On' > 7mF r 7ft 1 P xWi 


— 


rVi0002S 1 20D:F 12 


CH04MAL


590 


178614 


RTA0000^7I "5F r 70 1 P x> n 


F 


M0002726j A:F 1 0 


CH04MAL 


591 


373274 


RTAOflOO^fiTOF i 77 "> p Q-r, 


F 


llXAAA' , ' , 1 P> MIA 


CH09LNL 


592 


379820 


RTA0000767QF f 1 S I P 


— 


&JAAA ■* Cltt \ .a AO 

rVlUUOjvo/ /A:dOo 


CH09LNL 


593 


160536 


RT 400007663 F f 10 1 P Sen 


T 


KjIAAATII * 1^ 

M000222 j jC: A 1 2 


CH03MAH


594 


373313 


RTA00007671 F mO' 1 1 P Sea 





KvlAAA19' > "^Q p*. * a ^ 
IV| UUU J 0 J J O Ll . A 0 J 


CH09LNL


595 


26429 


RTA0000~>7PF k 73 i p <v^ n 


F 


MUUU-. /U-JIJ.Cj 1 1 


CH04MAL 


596 


17983 


RTA0000^7 1 1 F f 10 I P Sea 


F 


K.lAAA'^'iATO \ AAC 


1 IA > * * 4 1 1 


597 


375388 


RTA0000^68 1 F i 77 7 p c-a 


F 


\jIAAA7O0O0 D . I~\A "* 

rVlUUU jVooo B. DO j 


CH09LNL 


598 


63005 


RTAOOOO^PF m *M 1 P Sea 


r 




CH04MAL 


599 


23030 


RTA0000^709F b 10 "> P Sea 


— — 




CH02COH 


600 


372946 


RTA00002670F.l.07.2.P.Seq


" c 


|V|UUl/J J4J / L/./AlO 


CH09LNL 


601 


375351 


RTA00002680F.e.15.2.P.Seq


— 


Manfi^OTO"* \ • nn i 
rviuuujv / dU4 


pjjAAl Xlf 


602 


374502 


RTA00002673F.i.08.1.P.Seq


— = 


rUf nan ^ on 9 n r~ • u a £ 


r™ Ll Art J Vii 


603 


37691 1 


RTA00002682F.e.09.1.P.Seq


— p 




ruAQI VI 1 


604 


376024 


RTA00002675F.n.15.1.P.Seq


F 


Mono ^^Tn-rfti 


f^LJAQI XJl 


605 


377194 


RTA00002679F.h.20.1.P.Seq


F 


M00039685A:A08 


CH09LNL 


606 


37Q643 


RTA00002682F.g.08.1.P.Seq


F 


M00039978A:G03


CH09LNL 


607 


379610 


RTA00002680F.k.11.1.P.Seq


F 


M00039815C:F09 


CH09LNL


608 


25613 


RTA00002711F.g.06.1.P.Seq


F 


M00023024D:F12 


CH03MAH 


609 


207466 


RTA00002664F.j.08.1.P.Seq


F 


M00027733A:A02


CH04MAL 


610 


400052 


RTA00002687F.h.13.1.P.Seq


F 


M00040291D:C05


CH14EDT


611 


21290 


RTA00002712F.g.01.1.P.Seq


F 


M00026859D:D01


CH04MAL 








CLUSTER


SEQ NAME 


ORIENTATION


CLONE ID


LIBRARY 




612


375975 


RTA00002675F.n.18.1.P.Seq


F 


M00O39258O:808 


CH09LNL 




613


46804 


RTA00002712F.n.19.1.P.Seq


F 


M00027121D:C05


CH04MAL 




614


69863 


RTA00002712F.i.18.1.P.Seq


F 


M00026935C:B04


CH04MAL 




615


375285 


RTA00002676F.g.18.2.P.Seq


F 


M0003929SB:B06 


CH09LNL 




616


373000 


RTA00002670FJ. l3.lP.Seq 


F 


M00033437C:C03


CH09LNL 




617 


378679 


RTA00002681F.f.16.1.P.Seq


F 


M00039869B:F06 


CH09LNL 




618


45407 


RTA00002712F.k.11.1.P.Seq


F 


M00027016A:B06


CH04MAL 




619


16838 


RTA00002712F.e.23.1.P.Seq


F 


M00026803A:F08 


CH04MAL 




620


136425 


RTA00002713F.c.04.1.P.Seq


F 


M00027236A:E04 


CH04MAL 


621 


376485 


RTA00002676F.e.24.2.P.Seq


F 


M00039288C:B11 


CH09LNL 




622 


41108


RTA00002712F.n.12.1.P.Seq


F 


M00027108C:B03


CH04MAL 




623


430876 


RTA00002669F.c.02.1.P.Seq


F 


M00033186C:D11


CH08LNH 


624


185716 


RTA000027 13F.L07. 1 p.Seq 


F 


M00027537C:B01


CH04MAL 


625 


85338 


RTA00002712F.b.18.1.P.Seq


F 


M00023333D:C12


CH04MAL 




626 


185597 


RTA00002713F.m.23.1.P.Seq


F 


M00027596A:A10


CH04MAL 




627 


139348 


RTA00002713F.k.23.1.P.Seq


F 


M00027526D:F03 


CH04MAL 


628 


454665 


RTA00002693F.d.15.2.P.Seq


F 


M00043164C:E12


CH19COP


629 


186387 


RTA00002713F.l.01.1.P.Seq


F


M0002752SCBI0 


CH04MAL 


630 


186387 


RTA00002713F.k.24.1.P.Seq


F


M0002752SCBI0 


CH04MAL 


631 


21093 


RTA00002708F.h.20.1.P.Seq


F 


M0000430SC:C06 


CH01COH


632 


20827 


RTA00002710F.c.23.1.P.Seq


F 


M0002I67!D:F12 


CH03MAH 


633 


21290 


RTA00002712F.f.24.1.P.Seq


F 


M0002685OD:D0! 


CH04MAL


634 


17646 


RTA00002710F.d.22.1.P.Seq


F 


M0002!90SD:C12 


CH03MAH 


635 


402817 


RTA00002686F.X 10. 1 .P.Seq 


F 


M00039736D:G08 


CH13EDT 


636 


42854 


RTA000027 1 3F.ii.09. 1 .P.Seq 


F


M00027615A:F10 


CH04MAL 


637 


430876 


RTA00002669F.c.02.3.P.Seq


F 


M00033186C:D11


CH08LNH 


638 


37S64I 


RTA00002679F,aJ 1 .2.P.Seq 


F 


M000396i:C:E0S 


CH09LNL 


639 


375848 


RTA00002674F.m.03.2.P.Seq


F 


M0003916SC:A04 


CH09LNL 


640 


36165 


RTA00002703F.L06. 1 P.Seq 


F 


M0000434CC:C07 


CH01COH


641 


456506 


RTA00002694F.d.05.1.P.Seq


F 


M000434o:a:E01 


CH20COHLV 


642 


374450 


RT A0O0O2672F1 05 .2, P. Seq 


F 


M00039OUA:H!0 


CH09LNL 


643


378949 


RTA00002683F.o.21.2.P.Seq


F 


M00040IOOO:B06 


CH09LNL 


644


373313 


RTA00002671F.m.02.2.P.Seq


F 


. M0003832SD:A03 


CH09LNL 


645


377861 


RTA00002681F.m.20.1.P.Seq


F 


M00039893A:A08


CH09LNL 


646 


431196


RTA00002669F.f.07.2. P.Seq 


F 


M0003320-iB:A07 


CH08LNH 


647


372795 


RTA00002683F.a.06.1.P.Seq


F 


M00040032A:B03 


CH09LNL 


648


42340 


RTA00002661F.b.03.1.P.Seq


F 


M0000143^C.H06 


CH01COH


649


374410 


RTA00002674F.k.11.1.P.Seq


F 


M0003915SB;GI2 


CH09LNL 




374623 


RTA00002674F.a.01.2.P.Seq


F 


M00039113D:A06


CH09LNL 


651


431612


RTA00002669F.e.23.2.P.Seq


F 


MO0033:o:D:G06 


CH08LNH 


652 




RTA 0000^67^ F * 1Q 1 P 


c 
r 


MOOOjS9^:D:E03 


CH09LNL 


653 


428508 


RTA00002666F.d.01.1.P.Seq


F 


M00032545B:H09


CH08LNH


654 


235780 


RTA00002666F.d.03.1.P.Seq


F 


M0003254:D;G05 


CH08LNH 


655 


17890 


RTA00002710F.e.11.1.P.Seq


F 


M00021955A:H02


CH03MAH 


656 


20100 


RTA 00002 7 1 0F,g, ILL P.Seq 


F 


M00022rfD:DL2 


CH03MAH 


657 


4458 


RTA00002710F.g.18.1.P.Seq


F 


M00022IS-CCII 


CH03MAH 


658 


378347 


RTAOOOO:6S!F.h.07.: 4 P.Seq 


F 


M000398^5D:A10 


CH09LNL 






SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID


LIBRARY 


659 


373477 


RTA00002672F.b.23.1.P.Seq


F 


M00038639B:C03


CH09LNL


660 


15596 


RTA00002710F.g.02.1.P.Seq


F 


M00022135C:B05


CH03MAH


661 


21028 


RTA00002709F.I.09. 1 .P.Seq 


F 


M00007108B:A02




662 


374063 


RTA00002672F.h.15.2.P.Seq


F 


M00039011D:C10


CH09LNL


663 


380686 


RTA00002684F.a.03.2.P.Seq


F 


M00040107B:H07 


CH09LNL


664 


402950 


RTA00002686F.g.lll.P.Seq 


F 


M00040131B:H09




665 


428064 


RTA00002665F.K04, ! .P.Seq 


F 


M00031485D:G02


CH08LNH


666 


23310 


RTA00002708F.e.10.1.P.Seq


F 


M00004046OA08 


CH01COH


667 


376233 


RTA00002677F.b.15.2.P.Seq


F 


M00039339C:F03


CH09LNL


668


375843 


RTA00002674F.m.03.1.P.Seq


F 


M00039163C:A04 


CH09LNL


669 


242251


RTA00002665FJ.08. 1 .P.Seq 


F 


M00028772C:B09


CH08LNH 


670 


374064 


RTA00002672FT. l5.2.P.Seq 


F 


M00038999D:C11


CH09LNL


671 


146260 


RTA00002663F.d.17.1.P.Seq


F 


M00022099B:D06 


CH03MAH


672 


375575 


RTA00002677F.e.22.2.P.Seq


F 


M00039385B:E09


CH09LNL 


673 


355518 


RTA00002665F.c.15.3.P.Seq


F 


MOOO^S^OIBHP 


CH08LNH


674 


184223 


RTA00002662F.b.08.2.P.Seq


F 


M00005^39DG0I 




675 


213306 


RTA00002664F.e.07.2.P.Seq


F 


MO0O' 7 707S A BO 1 


CH04MAL


676 


429566 


RTA00002668F.b.04.1.P.Seq


F 


M0003~ , 907A G04 


CH08LNH


677 


378656 


RTA00002682F.c.09.1.P.Seq


F 


MOOO 99^ 7 A - F04 


CH09LNL


678


427760 


RTA00002668F.e.23.1.P.Seq


F 


M0003" > 940 A CO 7 




679 


372795 


RTA00002683F.a.06.2.P.Seq 


F 


M00040032A:B03


CH09LNL


680 


429340 


RTA00002666F.f.12.1.P.Seq


F 


MOO0 "? ? 5 77 A C04 


CH08LNH


681 


429822 


RTA00002668F.e.17.1.P.Seq


F 


MOO03 -> 939BE07 


CH08LNH


682 


375224 


RTA00002680F.d.22.2.P.Seq


F 


MOOO*97SSRA06 


CH09LNL


683 


378347 


RTA00002681F.h.07.1.P.Seq


F 


M0003987^DAIO 


CH09LNL 


684 


330109 


RTA00002682F.i.17.1.P.Seq


F 


M00039987C:G08


CH09LNL


685 


379001 


RTA00002683F.o.02.1.P.Seq


F 


M00040097A:C12 


CH09LNL 


686 


375348 


RTA00002676F.L I2.3.P.$eq 


F 


M00039304D:B09 


CH09LNL 


687 


377889 


RTA00002672F.c.08.2.P.Seq 


F 


M00038661A:A07


CH09LNL


688


429883 


RTA00002667F.g.05.1.P.Seq


F 


M00032793A:F06 


CH08LNH 


689 


377067 


RTA00002682F.l.24.1.P.Seq


F 


M00040014B:D01


CH09LNL 


690 


378001 


RTA0000268 1 ?m?WX P.Seq 


F 


M000398^8D:C06 


CH09LNL 


691 


45298 


RTA00002710F.j.21.1.P.Seq


F 


M00022433A:E02


CH03MAH 


692 


375431 


RTA00002680F.f.03.1.P.Seq


F 


M00039793D:C05 


CH09LNL 


693 


377861 


RTA00002681F.m.20.2.P.Seq


F 


M00039893A:A08


CH09LNL 


694 


428610 


RTA00002667F.c.09.1.P.Seq


F 


M00032766C:A04 


CH08LNH 


695 


20765 


RTA00002710F.U 0.1. P.Seq 


F 


M00022363C:G12


CH03MAH 


696 


27601 


RTAOOO0:?l3F,e.23.LRSeq 


F 


M00027314C:D09


CH04MAL 


697 


430540 


RTA00002668F.o.20.2.P.Seq


F 


M00033140D:F06


CH08LNH


698 


381024 


RTA00002670F.h.23.2.P.Seq


F 


M00033424B:A04


CH09LNL 


699 


16454 


RTA00002709F.f.07.1.P.Seq


F 


M000065 t J9D:B02 


CH02COH 


700 


372893 


RTA00002670F.i.03.2.P.Seq 


F 


M0003J4:4D:H12 


CH09LNL 


701 


373681


RTA00002671F.d.20.1.P.Seq


F 


M00033272D:F11


CH09LNL


702 


32260 


RTA00002684F.h.06.2.P.Seq


F 


M0004030'B;F01 


CH09LNL 


703 


377343 


RTA00002684F.s.04.2.P.Seq 


F 


M0004030:C:A04 


CH09LNL 


704 


374747 


RTA00002676F.e.07.2.P.Seq


F 


M00039286A:C06 


CH09LNL 


705 


185848 


RTA0O0O27l2F.m. I.I. l.P.Seq 


F 


M00027080A:B01


CH04MAL 






SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION


CLONE ID


LIBRARY 


706 


374311 


RTA00002676F.e.18.2.P.Seq


F 


M00039' 7 8~C' A06 


CH09LNL


707 


278923 


RTA00002667F.b.10.1.P.Seq


F 


M00032726C:C01


CH08LNH


708


378667 


RTA00002681F.b.11.2.P.Seq


F 


M00039847A:F06 


CH09LNL


709 


380454 


RTA00002673F.j.16.1.P.Seq


F 


M00039084D:D07 


CH09LNL


710 


381576 


RTA00002670Fx04.2.P.Seq 


F 


M00033425A:C10 


CH09LNL 


711 


375067 


RTA00002675F.o.03.1.P.Seq


F 


M00039^60C G03 


CH09LNL


712 


89706 


RTA00002714F.a.11.1.P.Seq


F 


M000 J) 774l 3 F09 


CH04MAL


713 


10583 


RTA000027iiFJU U.P.Seq 


F 


M00023100A:E12 


CH03MAH


714 


379982 


RTA00002682F.U6.I.P.Seq 


F 


M00039987C:E12


CH09LNL 


715 


378532 


RTA00002680Fn.043.RSeq 


F 


M000398' 1 SB'C05 


CH09LNL 


716 


379776 


RTA00002680F.a.22.2.P.Seq


F 


M00039774C:A03


CH09LNL


717 


374136 


RTA00002673F.f.16.1.P.Seq


F 


M00039072C:C03 


CH09LNL 


718 


98471 


RTA00002663F.j.21.1.P.Seq


F 


M00022670D:H11


CH03MAH 


719 


125365 


RTA00002668F.j.07.1.P.Seq


F


M0O033O(9B:EI0 


CH08LNH


720


375431 


RTA00002680F.f.03.2.P.Seq


F 


M00039793D:C05


CH09LNL 


721 


62826 


RTA00002661F.g.20.1.P.Seq


F 


^10000410^0 00^ 


CH01COH


722 


379972 


RTA00002679F.eJO. I .P.Seq 


F 


MOOO'967^0-010 


CH09LNL


723 


377554 


RTA00002679FX 1 0, 1 , P.Seq 


F 


MOO0"59675D B03 


CH09LNL


724


230479 


RTA00002664F.cJ62.PSeq 


F 


M0OO" 1 6915BCO6 


CH04MAL 




98872 


RTA00002663F.j.19.1.P.Seq


F 


M000^668S BP 


CH03MAH


726


42635 


RTA00002679F.h.13.1.P.Seq


F 


M00039684D:B08 


CH09LNL


727 


379044 


RTA00002679F.a.10.2.P.Seq


F


M00039652B:D05 


CH09LNL 


728 


96093 


RTA00002663FJ.07. 1. P.Seq 


F 


M000^640C CP 


CH03MAH 


729 


403642 


RTA00002687F.d.01.2.P.Seq


F 


M00039945C:F09


CH14EDT


730 


400921 


RTA00002685F.b.18.2.P.Seq


F 


M00039371B:H06


CH12EDT


731 


93587 


RTA00002663F.k.10.1.P.Seq


F 


M00022731A:D02


CH03MAH 


732 


79951 


RTA000027l3FclS,LRSeq 


F 


M00027258A:A07 


CH04MAL 


733 


176509 


RTA00002686F.b.09.1.P.Seq


F 


M00039756B:H06 


CH13EDT


734 


451753 


RTA00002694F.e.06.1.P.Seq


F 


M00043634A:C10


CH20COHLV


735 


186266 


RTA00002713F.c.16.1.P.Seq


F 


M00027256B:H09 


CH04MAL 


736 


235052 


RTA00002692F.3. 1 5.2.P.Seq 


F 


M00042626B:D08 


CH15CON


737 


377233 


RTA00002632F.e.23. 1 .P.Seq 


F 


M00039940D:G08 


CH09LNL 


738 


378532


RTA00002680F.n.04.2.P.Seq


F 


M00039328B:C05 


CH09LNL 


739 


177932 


RTA00002713F.b.22.1.P.Seq


F 


M00027233B:C01


CH04MAL 


740 


9332 


RTA00002712F.p.18.1.P.Seq


F 


M00027179D:E06


CH04MAL 


741


240318 


RTA00002687F.d.04.2.P.Seq 


F 


M00039947A:D06 


CH14EDT


742 


404260 


RTA00002637F.C.I L2.P.Seq 


F 


M00039942D:C01 


CH14EDT


743 


93767 


RTA00002712F.g.09.1.P.Seq


F 


M00026S68C:E11 ! 


CH04MAL 


744 


185642


RTA00002712F.f.20.1.P.Seq


F 


M00026856D:F02


CH04MAL 


745 


447544 


RTA00002639F.C 1 8.3. P.Seq 


F 


M00042905D:D02


CH15CON 


746 


403274 


RTA00002687F.b.10.2.P.Seq


F 


M00039766A:G07 


CH14EDT 


747 


404257 


RTA00002687F.g.06.1.P.Seq


F 


M00040208A:C03


CH14EDT


748 


403868 


RTA00002687F.k.05.2.P.Seq




M000403|SC:K! 1 


CH14EDT 


749 


450074 


RTA00002691F.fi. 12.2. P.Seq 




M000433^2D:C1 1 


CH17COHLV


750 


404520 


RTA00002687F.f.05.2.P.Seq




MO0040:O2A:FO5 


CH14EDT


751 


451789 


RTA00002692F.b.04.2.P.Seq




M00042956C:B06 


CH15CON


752 


455173 


RTA00002694F.b.19.1.P.Seq




M00043447A:C07


CH20COHLV 






SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION


CLONE ID


LIBRARY 


753 


455136 


RTA00002694F.a.08.1.P.Seq


F 


M00042595A:B01


CH20COHLV 


754 


379001 


RTA00002683F.o.02.2.P.Seq


F 


M00040097A:C12 


CH09LNL 


755 


374763 


RTA00002673F.p.21.1.P.Seq


F 


MOOQ3911SB:C05 


CH09LNL 


756 


402508 


RTA00002686F.o.15.1.P.Seq


F 


M000402SID:BOI 


CH13EDT 


757 


431370 


RTA00002669F.rn.04J. P.Seq 


F 


M00033288B:D12


CH08LNH 


758 


380500 


RTA00002670F.p.19.1.P.Seq


F 


M00033583B:E06 


CH09LNL 


759 


376743 


RTA00002678F.e.22.2.P.Seq


F 


M00039461A:F04


CH09LNL 


760 


191690 


RTA00002673F.m.19.1.P.Seq


F 


M00039107C:E04


CH09LNL 


761 


374264 


RTA00002671F.p.21.2.P.Seq


F 


M00038620B:E09 


CH09LNL 


762 


373020 


RTA00002671F.b.20.2.P.Seq


F 


M00033595A:CH 


CH09LNL 


763 


375231 


RTA00002671F.m.20.2.P.Seq


F 


M00038387B:A07 


CH09LNL 


764 


16180 


RTA00002709F.j.17.1.P.Seq


F 


M00006977D:A03 


CH02COH 


765 


379403 


RTA00002683Fx. 1 7.2.P.Seq 


F 


M00040041C:C09 


CH09LNL 


766 


375382 


RTA00002677F.d.24.2.P.Seq


F 


M000393SID:C02 


CH09LNL 


767 


379653 


RTA00002683F.c.03.2.P.Seq


F 


M00040033D:G04


CH09LNL 


768 


377858 


RTA00002681F.e.14.2.P.Seq


F 


M00039864A:A07 


CH09LNL 


769 


430861 


RTA00002668F.h.18.1.P.Seq


F 


M00032995C:C05


CH08LNH


770 


376128 


RTA00002677F.a.11.2.P.Seq


F


M00039334B:E03 


CH09LNL 


771 


375009 


RTA00002676F.n.20.2.P.Seq


F 


M00039322A:F04 


CH09LNL 


772 


429816 


RTA00002667F.n.22.1.P.Seq


F 


M00032871D:E11


CH08LNH


773 


375657 


RTA00002681F.h.13.2.P.Seq


F 


M00039877C:C03 


CH09LNL 


774 


427889 


RTA00002666F.b.14.1.P.Seq


F 


M00032530D:C02 


CH08LNH


775 


376761 


RTA00002677F.g.03.2.P.Seq


F 


M00039391D:F08


CH09LNL 


776 


44025 


RTA00002684F.b.24.2.P.Seq


F 


M00040115B:A04


CH09LNL 


777 


44025 


RTA00002684F.c.01.2.P.Seq


F 


M00040115B:A04


CH09LNL 


778 


392524 


RTA00002681F.p.04.2.P.Seq


F 


M00039909D:C02 


CH09LNL 


779 


427252 


RTA00002665F.b.13.1.P.Seq


F 


M00028185B:A06


CH08LNH


780 


374927 


RTA00002673F.e.12.1.P.Seq


F 


M0003906SC:E06 


CH09LNL 


781 


373226 


RTA00002680F.g.09.1.P.Seq


F 


M00039797CC05 


CH09LNL


782 


217964 


RTA00002664F.g.08.2.P.Seq 


F 


M00027299B:B12 


CH04MAL 


783 


376368 


RTA00002677F.b.14.2.P.Seq


F 


M00039339A:H07


CH09LNL 


784 


377719 


RTA00002677F.j.11.2.P.Seq


F 


M00039407B:G02 


CH09LNL 


785 


378081 


RTA00002677F.e.16.2.P.Seq


F 


M00039384C:E02 


CH09LNL 


786 


89267


RTA00002662F.b.01.2.P.Seq


F 


M00005445D:B01 


CH02COH 


787 


374927 


RTA00002673F.e.12.2.P.Seq


F 


M0003906SC:E06 


CH09LNL 


788 


279054 


RTA00002667F.b.23.1.P.Seq


F 


M00032731B:C10 


CH08LNH


789 


377283 


RTA00002682F.m.19.1.P.Seq


F 


M00040016C:H12 


CH09LNL 


790 


45318 


RTA00002710F.l.05.1.P.Seq


F 


M00022533A:A08 


CH03MAH 


791 


188292


RTA00002664F.e.23.2.P.Seq


F 


M00027162B:F05


CH04MAL 


792 


378S72 


RTA00002683F.c.20.2.P.Seq 


F 


M00040042B:A10


CH09LNL 


793 


427252 


RTA00002665F.b. l3.: v P.Seq 


F 


M00028185B:A06


CH08LNH


794 


380618 


RTA00002673F.j.12.2.P.Seq


F 


M000390S4CG07 


CH09LNL


795 


35646 


RTA00002667F.g.16.1.P.Seq


F 


M00032797B:G02 


CH08LNH


796 


46407 


RTA00002665F.c.10.3.P.Seq


F 


M00028196D:A03


CH08LNH


797 


373720 


RTA00002674F.c.04.1.P.Seq


F 


M00039124C:F03


CH09LNL


798 


429693 


RTA00002668F.f.05.1.P.Seq


F 


M00032944B:B02 


CH08LNH


799 


377108 


RTA00002678F.p.04.2.P.Seq


F 


M00039636C:D11


CH09LNL






SEQ 












ID 


CLUSTER 


SEQ NAME 


ORIENTATION


CLONE ID


LIBRARY 


800 


375657 


RTA00002681F.h.13.1.P.Seq


F 


M00039877C:C03 


CH09LNL 


801 


374868 


RTA00002673F.d.08.2.P.Seq


F 


M00039063B:D08 


CH09LNL


802 


428716 


RTA00002667F.e.08.1.P.Seq


F 


M00032766B:D12


CH08LNH 


803 


44025


RTA00002684F.c.01.1.P.Seq


F


M00040115B:A04


CH09LNL 


804 


430327 


RTA000Q2668F.U LI. P.Scq 


F 


M00033033C:H01


CH08LNH


805 


374328 


RTA00002673F.c.24.1.P.Seq


F 


M00039061B:F08


CH09LNL 


806 


376946 


RTA00002682F.n.10.1.P.Seq


F 


M00040019A:E01


CH09LNL 


807 


375522 


RTA00002677Fn.08JlP.Seq 


F 


M00039420D:D03 


CH09LNL 


808 


395617 


RTA00002687F.b.15.2.P.Seq


F


M00039767B:A04


CH14EDT


809 


21686 


RTA00002712F.g.05.1.P.Seq


F 


M00026365B:A06 


CH04MAL 


810 


452038 


RTA00002692F.a.09.2.P.Seq 


F 


M00042623D:D07 


CH18CON


811 


25632 


RTA00002711F.g.16.1.P.Seq


F 


M0O02304'»D:DO2 


CH03MAH


812 


152487


RTA00002663F.e.12.1.P.Seq


F


M00022181C:D01


CH03MAH 


813 


378226 


RTA00002680F.g.09.2.P.Seq


F 


M00039797C:G05 


CH09LNL 


814 


402446 


RTA00002686F.c.04.1.P.Seq


F


M00040133B:B03


CH13EDT


815 


403642 


RTA00002687F.c.24.2.P.Seq


F


M00039945C:F09


CH14EDT


816 


186359 


RTA00002713F.g.24.1.P.Seq


F


M00027379C:B07


CH04MAL


817


404290 


RTA00002688F.e.04.2.P.Seq


F


M00040395B:D11


CH14EDT 


818 


375443 


RTA00002676F.g A 9.2.P.Seq 


F 


M00039298B:D03 


CH09LNL 


819 


380279 


RTA00002673F.i.24.1.P.Seq


F


M00039082B:A05


CH09LNL


820 


386110 


RTA00002687F.e.06.1.P.Seq


F


M00039955C:C04


CH14EDT 


821 


380279 


RTA00002673F.j.01.1.P.Seq


F 


M00039082B:A05 


CH09LNL 


822 


386986 


RTA00002675F.p.06.1.P.Seq


F


M00039266A:B02


CH09LNL 


823 


186359 


RTA00002713F.h.01.1.P.Seq


F 


M00027379C:B07 


CH04MAL 


824 


375611 


RTA00002677F.o.20.2.P.Seq 


F 


M00039425D:E12 


CH09LNL 


825 


378285 


RTA00002679F.h.01.1.P.Seq


F


M00039681B:H09


CH09LNL 


826 


44025 


RTA00002684F.b.24.1.P.Seq


F 


M00040115B:A04


CH09LNL 


827 


25240 


RTA0000271 1 Fx. 12. LP.Seq 


F 


M00022854A:B03 


CH03MAH 


828 


403700 


RTA00002687F.g.03.2.P.Seq


F


M00040207B:D08


CH14EDT


829 


404679 


RTA00002687F.f.07.1.P.Seq


F


M00040203A:H06


CH14EDT


830 


454806 


RTA00002693F.b.12.2.P.Seq


F


M00043093C:G11


CH19COP 


831 


376829 


RTA00002674F.f.21.2.P.Seq


F


M00039135D:G02


CH09LNL 


832 


456309 


RTA00002694F.j.16.1.P.Seq


F


M00043518B:D06


CH20COHLV 


833 


374510 


RTA00002672F.i.17.2.P.Seq


F


M00039015D:H04


CH09LNL 


834 


377232 


RTA00002683F.m.08.2.P.Seq 


F 


M00040090B:G09 


CH09LNL 


835 


375779 


RTA00002672F.j.20.2.P.Seq


F


M00039025A:H09 


CH09LNL 


836 


90746 


RTA00002671F.a.07.2.P.Seq


F 


M00033585D:A02 


CH09LNL 


837 


453002 


RTA00002692F.b.21.2.P.Seq


F


M00042970C:H10


CH18CON


838 


402863 


RTA00002686F.n.12.1.P.Seq


F 


M00040273B:H12 


CH13EDT 


839 


402526 


RTA00002686F.p.07.1.P.Seq


F


M00040286C:C02


CH13EDT


840 


412778 


RTA00002685F.i.07.1.P.Seq


F


M00039533D:F04


CH12EDT


841


402273 


RTA00002686F 18. L P.Seq 


F 


M00040233C:G05


CH13EDT


842 


374744 


RTA00002670F.U 6. LP.Seq 


F 


M00033427D:F01


CH09LNL


843 


375764 


RTA00002677F.o.18.2.P.Seq


F


M00039425C:G01


CH09LNL 


844 


428218 


RTA00002667F.c.01.1.P.Seq


F 


M00032731C:C07 


CH08LNH 


845 


374809


RTA00002675F.h.01.1.P.Seq


F 


M00039230D:D09 


CH09LNL 


846 


20162 


RTA00002710F.n.20.1.P.Seq


F 


M00022662D:G11


CH03MAH 



847


375782


RTA00002677F.d.23.2.P.Seq


F 


M000393S;C:H08 


CH09LNL 


848


372958 


RTA00002672F.c.02.1.P.Seq




M00038639D:F07 


CH09LNL 


849 


403940 


RTA00002688F.d.07.2.P.Seq 




M0004038"D:H05 


CH14EDT


850 


8490 


RTA00002711F.g.03.1.P.Seq


F


M00023020C:G08


CH03MAH 


851 


374809 


RTA00002675F.g.24.1.P.Seq


F 


M00039230D:D09 


CH09LNL 


852 


377788 


RTA00002684F.g.24.2.P.Seq 


F 


M00040305C:H06


CH09LNL 


853 


13847 


RTA00002711F.f.09.1.P.Seq


F


M00022976C:F04


CH03MAH 


854 


374172 


RTA00002673F.k.16.1.P.Seq


F 


M00039097D:D06 


CH09LNL 


855 


380314 


RTA00002682F.l.07.1.P.Seq




M00040009D:B07


CH09LNL 


856 


47231 


RTA00002714F.b.15.1.P.Seq


F


M00027813C:F01


CH04MAL 


857 


400287 


RTA00002685F.k.10.1.P.Seq


F


M00039584C:C01


CH12EDT


858 


400533 


RTA00002685F.a.02.2.P.Seq


F 


MOOO391S!D:E05 


CH12EDT 


859 


447594 


RTA00002689F.c.07.1.P.Seq


F


M00042696B:E05


CH15CON 


860 


147357 


RTA00002711F.e.15.1.P.Seq


F


M00022923B:C01


CH03MAH 


861 


401141


RTA00002685F.o.22.2.P.Seq


F


M00039642D:B12


CH12EDT 


862 


404620 


RTA00002637FX.03 . 1. P.Seq 


F 


M00039770A:G11


CH14EDT


863 


24360 


RTA00002709F.I.20. 1. P.Seq 


F 


M00007149A:G02


CH02COH 


864 


380613 


RTA00002673F.j.12.1.P.Seq


F


M00039084C:G07


CH09LNL 


865 


448446 


RTA00002690F.d.09.3.P.Seq 


F 


M00042797D:D10


CH16COP


866 


402313 


RTA00002686F. f. 1 S. I .P.Seq 


F 


M0004017^D:G03 


CH13EDT 


867 


273151


RTA00002685F.c.05.2.P.Seq


F


M00039374C:H02


CH12EDT 


868 


404172 


RTA00002687F.d.17.2.P.Seq


F


M00039951B:B12


CH14EDT


869 


263630 


RTA00002694F.e.10.1.P.Seq


F


M00043637C:H01


CH20COHLV 


870 


404277 


RTA00002687F.d.13.1.P.Seq


F


M00039951B:C03


CH14EDT 


871 


403557 


RTA00002687F.d.10.1.P.Seq


F 


M00039943A:E03 


CH14EDT 


872 


375161 


RTA00002676F.m.24.2.P.Seq 


F 


M00039319B:H12


CH09LNL 


873 


376829 


RTA00002674F.f.21.1.P.Seq


F


M00039135D:G02


CH09LNL 


874 


372953 


RTA00002672F.c.02.2.P.Seq


F 


M00038639D:F07 


CH09LNL 


875 


21578 


RTA00002709F.a.24.1.P.Seq


F


M00005351C:G05


CH02COH 


876 


402506 


RTA00002686F.b.17.1.P.Seq


F


M00039760B:B08


CH13EDT


877 


141731


RTA00002713F.b.04.1.P.Seq


F 


M00027212D:E03 


CH04MAL 


878 


3741! 


RTA00002661F,eJ U, P.Seq 


F 


M00003770A:E05


CH01COH


879 


372537 


RTA00002670F.c.05.2.P.Seq


F 


M00033345D:A09 


CH09LNL 


880 


380834 


RTA00002670F.c.OS.2.P.Seq 


F 


M00033346C:A05


CH09LNL


881 


401492 


RTA00002685F.n.17.2.P.Seq


F


M00039609D:F07


CH12EDT 


882 


99998 


RTA00002662F.b.23.2.P.Seq


F


M00006712C:H09


CH02COH 


883 


404311


RTA00002688F.d.21.2.P.Seq


F 


M00040394A:D04 


CH14EDT 


884 


231084 


RTA00002664F.C. ! S.2.P.Seq 


F 


M000269!SB:DOI 


CH04MAL


885 


447679 


RTA00002689F.b.11.3.P.Seq


F


M00042560A:F12


CH15CON


886


377012 


RTA00002682F.d.17.1.P.Seq


F 


M00039936C:C05 


CH09LNL 


887 


226207 


RTA00002664F.d.21.2.P.Seq


F 


M0002703f D:C06 


CH04MAL 


888 


446183 


RTA000026S9F.3. 1 2. 1 .P.Seq 




M0004253-iA:A05 


CH15CON


889 


428508 


RTA00002666F.C.24.1. P.Seq 




M00032545B:H09 


CH08LNH 


890 


157643 


RTA00002714F.b.20.1.P.Seq




M00027S1SCC07 


CH04MAL 


891 


404609 


RTA000026SSF.b. IS.2.P.Seq 




M00040377C:G07


CH14EDT


892 


400464 


RTA00002685F.l.10.1.P.Seq




M00039590D:D02


CH12EDT 


893


379108 


RTA000026S5FJ. 12.1. P.Seq 




M00039591C:D06


CH12EDT



894


374639


RTA00002676F.d.21.2.P.Seq


F


M00039284D:B12


CH09LNL 


895 


380674 


RTA00002673F.j.11.2.P.Seq


F


M00039084C:H04


CH09LNL


896 


380674 


RTA00002673F.j.11.1.P.Seq


F 


M00039084C:H04 


CH09LNL 


897 


188972 


RTA00002664F.d.20.2.P.Seq 


F 


M00027030C:H06


CH04MAL 


898 


402835 


RTA00002686F,c.0U. P.Seq 


F 


M00040131D:C08


CH13EDT 


899 


403774 


RTA00002687F.d.08.2.P.Seq


F


M00039947C:G03


CH14EDT


900 


374606 


RTA00002673F.j.23.2.P.Seq


F


M00039096A:A05


CH09LNL 


901 


192535 


RTA00002663F.m.14.1.P.Seq


F 


M00022925C:A08 


CH03MAH 


902 


377926 


RTA00002680F.L 1 6.2.P.Seq 


F 


M00039820B:B06


CH09LNL


903 


1 86055 


RTA00002712F.j.11.1.P.Seq


F 


M00026926A:E10 


CH04MAL 


904 


380498 


RTA00002684F.f.lL2_P.Seq 


F 


M00040129D:E10


CH09LNL 


905 


400236 


RTA00002685F.i.18.2.P.Seq


F 


M00039561A:B07 


CH12EDT 


906 


401070 


RTA00002688F.d.12.2.P.Seq


F


M00040390A:H02


CH14EDT


907 


452622 


RTA00002692F.b.14.2.P.Seq


F 


M00042962D:C05 


CH18CON 


908 


235052 


RTA00002692F.X 1 5;| .P.Seq 


F 


M00042626B:D08 


CH18CON


909 


452221 


RTA00002692F.c.13.2.P.Seq


F


M00042936C:C12


CH18CON


910 


404581 


RTA00002687F.2. 1 \ .2.P,Seq 


F 


M00040208D:G09


CH14EDT 


911 


376925 


RTA00002687F.e.14.2.P.Seq


F


M00039957C:C09


CH14EDT 


912 


400287 


RTA00002685F.k.10.2.P.Seq


F


M00039584C:C01


CH12EDT


913 


403242 


RTA00002687FJ.05.1RSeq 


F 


M00040323B:C12


CH14EDT 


914 


453313 


RTA00002693F.a.07.2.P.Seq


F


M00042614B:B05


CH19COP 


915 


452633 


RTA00002692FX 11 2.RSeq 


F 


M00043067D:D10 


CH18CON


916 


447679 


RTA00002689F.b.11.1.P.Seq


F 


M00042560A:F12 


CH15CON 


917 


452398 


RTA00002692F.f.17.1.P.Seq


F


M00043125C:A11


CH18CON


918 


449797 


RTA00002691F.b.22.3.P.Seq


F


M00043334B:A10


CH17COHLV


919 


403916 


RTA00002687F.j.11.2.P.Seq


F 


M00040314D:H05 


CH14EDT 


920 


236906 


RTA00002693F.d.05.2.P.Seq


F


M00043154A:B07


CH19COP 


921 


404161 


RTA00002687F.e.20.2.P.Seq


F


M00039958C:B09


CH14EDT


922 


386110 


RTA00002687F.e.06.2.P.Seq


F


M00039955C:C04


CH14EDT


923 


451512 


RTA00002691F.b.02.3.P.Seq


F


M00043305B:C02


CH17COHLV 


924 


400517 


RTA00002687F.k.15.2.P.Seq


F


M00040320D:F02


CH14EDT


925 


403578 


RTA00002687F.i.01.2.P.Seq


F


M00040296D:E09


CH14EDT


926 


403578 


RTA00002687FA24.2.P.Seq 


F


M00040296D:E09


CH14EDT


927 


403371 


RTA00002687F.h.19.2.P.Seq


F


M00040294D:D12


CH14EDT


928 


452531


RTA00002692F.f.16.1.P.Seq


F


M00043125A:B11


CH18CON


929 


454453 


RTA00002693F.f.15.2.P.Seq


F


M00043215A:D02


CH19COP 


930 


238270 


RTA00002692F.e.07.2.P.Seq


F 


M0004302SA:G05 


CH18CON


931 


14583 


RTA00002687F.f.08.2.P.Seq


F


M00040203B:A05


CH14EDT


932 


400464 


RTA00002685F.l.10.2.P.Seq


F 


M00039590D:D02 


CH12EDT 


933 


404642 


RTA00002687F.f.02.2.P.Seq 


F 


M00040201C:G11


CH14EDT


934 


380413 


RTA00002680F.1C. 19.1. P.Seq 


F 


M000j9S16C:D05 


CH09LNL 






O T i AAAA^£A^ r* _ ia ^ r\ r* 

K I A00U0^69jF 4 c i 20.2 : P.Seq 


F 


M00043148C:A09 


CH19COP


936 


20847 


RTA00002710F.d.09.1.P.Seq


F 


M000:i852D:A05 


CH03MAH 


937 


456531 


RTA00002694F.b.18.1.P.Seq


F


M00043446C:E12


CH20COHLV 


938 


450463


RTA00002694F.a.12.1.P.Seq


F 


M00042596C:D07 


CH20COHLV 


939 


456713 


RTA00002694F.d.13.1.P.Seq


F 


M000435I3D:GOS 


CH20COHLV 


940 


455508 


RTA00002694F.a.15.1.P.Seq


F 


M00Q425^7B:EI2 


CH20COHLV 



CH09LNL 
CH13EDT 
CH09LNL 
CH09LNL
CH12EDT 
CH08LNH 
CH16COP 
CH12EDT 
CH19COP 


941 
942 
943 
944 
945 
946 
947 
948
949 
950 


376138 
402831
373820 

OJ JOO 

400732 
431629
449349 
401124 
453233 
124813 


" RTA00002674F.m.05.2.P.Se 
RTA00002686F.m.03.1.P.Seq
RTA00002674F.d.06.2.P.Seq
RTA00002674F.c.06.2.P.Seq
RTA00002685F.k.24.2.P.Seq
RTA00002669F.l.14.1.P.Seq
RTA00002690F.d.12.3.P.Seq
RTA00002685F.OJ l.2.P.Seq
RTA00002693F.a.01.2.P.Seq
RTA00002685F.j.10.2.P.Seq


F
F

F

F
F
F
F
F
F


M00039169A:E12
M00040264D:C05

M00039127A:G11
M0003912aC:H08

M00039587C:F12
M00033276B:G08
M00042802C:C04

M00039629D:B04

M00042611A:A06


951 
952 
953 
954 
955 
956 
957 
958 
959 


454627 
169464 
451654 
406092 
453501 
450845 

ddR 1 77 

402617 
378014 


RTA00002693F.f.09.2.P.Seq
RTA00002663F.l.19.1.P.Seq
RTA00002692F.f.02.2.P.Seq

RTA00002685F.k.11.2.P.Seq

dta nnDn*) aoi c a i r> c 
In 1 1\ VUUU.£ OV J r, (J, 1 .r.Seq 

RTA00002691F.f.10.1.P.Seq
K 1 AUUUUJoyUr.e. ll> 1 .P.Seq 
RTA00002686F.b.21.1.P.Seq
RTA0OOO26S0F.*. 1 7. 1. RSeq 


F
F

F
F

F

F
F
F

F

F


M00039564B:C01
M00043210C:E05
M00022602A:E09
M00043044D:A09
M00039584C:C11
M00043162D:C12
M00043410C:A09
M00042839B:B11
M00040131B:D11
M00039799A:D10


CH12EDT 
CH19COP
CH03MAH

CH12EDT

CH19COP

CH17COHLV

CH16COP
run.cnr 1 

vn UCUI J 


960 
961


124813
29450 


RTA00002685F.j.10.1.P.Seq
RTA00002663F.d.07.1.P.Seq


F 

F


M00039564B:C01




962 
963 


400486 

44753 


RTA00002685F.e.02.1.P.Seq
RTA000027 1 3FT.05- 1 ,RSeq 


F 


f MOOO^O ^ 4 A * HO "5 
M00039496B:D08


run'! v.i \ lj I 

PHnpnT I 
n i -cu i j 


964 


448177 


RTA00002690F.fi. 1 2.2.RSeq 


F 
F 


M00027324D:C05 
M000428j9B:BI 1 


PHfiJ.!V*1 A t 1 

PHi^rnp 1 


965 


447697 


RTA00002689F.c.15.3.P.Seq


F 


M0004~ > 90* A Fl I 


cu i >rn\i I 

\_ n i «?v-\JiN | 


966 


240318 


RTA00002687F.d.04.1.P.Seq


F


M00039947A:D06


fLjiipriT 1 


967 


451620 


RTA00002691F.d.20.3.P.Seq


F


M00043379D:H02


CH17COHLV


968 


400157 


RTA00002685F.i.20.2.P.Seq


F


M00039561B:A09


CH12EDT


969 


400276 


RTA00002685F.h.16.2.P.Seq


F


M00039528B:B12


CH12EDT


970 


449779 


RTA00002691F.d.04.3.P.Seq


F


M00043367B:A08




971 


400157 


RTA00002685F.i.20.1.P.Seq


F


M00039561B:A09


CH12EDT


972 


238133 


RTA00002685F.e.03.2.P.Seq


F


M00039496B:H09


CH12EDT


973 


452015 


RTA00002692F.c.07.2.P.Seq


F


M00042981B:D11


CH18CON


974 


400732 


RTA00002685F.l.01.2.P.Seq


F


M00039587C:F12


CH12EDT


975 


24984 


RTA00002711F.d.21.1.P.Seq


F


M00022910A:A06


CH03MAH


976 


449040 


RTA00002690F.e.14.2.P.Seq


F


M00042841D:H07


CH16COP


977 


377481 


RTA0000267lF.U5.3.RSeq 


F


M00038303A:C03


CH09LNL 


978 


400910 


RTA00002685F.b.07.1.P.Seq


F


M00039367B:H02


CH12EDT


979 


376945 


RTA00002682F.k.23.1.P.Seq


F


M00040007D:A06


CH09LNL


980 


15906 


RTA00002709F.c.14.1.P.Seq


F 


M0000530-"D;D12 


CH02COH


981 


452781 


RTA00002692F.b.l6.2.P.Seq 


F


M00042966B:F07 


CHI SCON 


982


A I OQ 1 
4 1 J_V4 


K 1 A000u2686h.t. 1 4.1. P.Seq | 


F 


M00040173D:B05 


CH13EDT


983 


401644 


RTA00OO2685F.n.l6j ; RSeq 


F 


M00039608D:H01


CH12EDT


984 


404402 


RTA00002687F.a.19.2.P.Seq


F


M00039761D:E10


CH14EDT


985 


401709 


RTA0O0O:685F.n.24.:,RSeq 


F 


M00039624A:H09 


CH12EDT


986 


401644


RTA00002685F.n.16.2.P.Seq


F


M00039608D:H01


CH12EDT 


987 


452531 


RTA00002692F.f.16.2.P.Seq


F


M00043125A:B11


CH18CON



988


400910 


RTA00002685F.b.07.2.P.Seq 


F 


M00039367B:H02 


CH12EDT 


989 


449235 


RTA00002690F.a.22.3.P.Seq


F


M00042439B:B03


CH16COP 


990 


449794 


RTA00002691F.c.22.2.P.Seq


F


M00043361B:A01


CH17COHLV


991 


400921 


RTA00002685F.b.18.1.P.Seq


F


M00039371B:H06


CH12EDT


992 


373874 


RTA00002672F.c.22.2.P.Seq


F


M00038663D:H10


CH09LNL 


993 


401050 


RTA00002685F.e.09.2.P.Seq


F


M00039499C:A04


CH12EDT 


994 


453237 


RTA00002693F.c.02.2.P.Seq 


F 


M00043108A:F06


CH19COP 


995 


449294 


RTA00002690F.c.13.3.P.Seq


F


M00042770C:C04


CH16COP 


996 


404260 


RTA00002687F.c.11.1.P.Seq


F


M00039942D:C01


CH14EDT


997 


378014


RTA00002680F^ 1 7.2,P.Seq 


F 


M00039799A:D10


CH09LNL 


998


404726 


RTA00002688F.a.18.2.P.Seq


F


M00040371C:H05


CH14EDT


999 


451347 


RTA00002691F.b.11.3.P.Seq


F


M00043311C:E03


CH17COHLV 


1000 


401154 


RTA00002685F.e.06.2.P.Seq 


F 


M00039497C:C06


CH12EDT 


1001 


401870 


RTA00002686F.b.22.1.P.Seq


F


M00040131C:F03


CH13EDT


1002


400170


RTA00002685F.b.03.2.P.Seq


F


M00039366C:B07


CH12EDT 


1003 


25387 


RTA000027llF.f.l9J.P.Seq 


F 


M00023001C:C08 


CH03MAH 


1004 


377085 


RTA00002678F.n.14.1.P.Seq


F


M00039619B:D02


CH09LNL


1005 


403530 


RTA00002688F.a.09.2.P.Seq 


F 


M00040363A:F01 


CH14EDT


1006 


372930 


RTA00002670F.j.12.2.P.Seq


F 


M0003342-C:A07 


CH09LNL 


1007 


401120


RTA00002685F.c.23.2.P.Seq


F


M00039379A:B03


CH12EDT 


1008 


403397 


RTA00002687F.h.02.2.P.Seq 


F 


M00040219B:D02


CH14EDT 


1009 


449337 


RTA00002690F.c.18.3.P.Seq


F 


M00042774C:C05


CH16COP 


1010 


403561 


RTA00002688F.d.06.2.P.Seq 


F 


M0004038T:E07 


CH14EDT 


1011


134182 


RTA00002692F.d.13.2.P.Seq


F


M00043011A:H12


CH18CON


1012 


377085 


RTA00002678F.n.14.2.P.Seq


F


M00039619B:D02


CH09LNL 


1013 


376138 


RTA00002674F.m.05.1.P.Seq


F


M00039169A:E12


CH09LNL 


1014


401154 


RTA00002685F.e.06.1.P.Seq


F


M00039497C:C06


CH12EDT 


1015 


449825 


RTA00002691F.b.14.3.P.Seq


F 


M00043320B:A07 


CH17COHLV 


1016 


403896 


RTA00002687F.a.04.2.P.Seq


F 


M00039746C:H05 


CH14EDT 


1017 


377632 


RTA0O002633F.U8.2.P.Seq 




M000400S~D:F08 


CH09LNL 


1018 


450845 


RTA00002691F.f.10.2.P.Seq


F 


M00043410C:A09 


CH17COHLV 


1019 


450045 


RTA0000269 1 F.e, 1 OJ.P.Seq 


F 


M00043391A:C10


CH17COHLV 


1020 


402962 


RTA00002686F.d.22.1.P.Seq


F 


M00040147D:H11 


CH13EDT 


1021 


427674 


RTA00002665F.i.10.1.P.Seq


F


M00028775D:F03


CH08LNH


1022 


403252 


RTA00002688F.c.15.2.P.Seq


F 


M000403S3D:C04 


CH14EDT 


1023 


452038 


RTA00002692F.a.09.1.P.Seq


F


M00042623D:D07


CH18CON


1024 


401553 


RTA00002685F.d.08.2.P.Seq


F 


M000394S23:G02 


CH12EDT


1025


451092


RTA00002691F.d.17.3.P.Seq


F 


M0004337~-\.C03 


CH17COHLV 


1026 


403978


RTA00002687F.g.09.2.P.Seq


F 


M0004020SB.A07 


CH14EDT


1027 


377186 


RTA00002682F.m.07.1.P.Seq




M000-+00 1 4 D: r 0^ 


V-riUVL>L 


1028


404679


RTA00002687F.f.07.2.P.Seq




M00040203A:H06 


CH14EDT 


1029 


373875 


RTA0000:674F iC .05.i.P ; Seq 




M00039i:aC:H02 


CH09LNL 


1030 


128841 


RTA00002685F.o.15.2.P.Seq




M00039630C:H04 


CH12EDT


1031 


33971 


RTA00002713F.h.13.1.P.Seq




M0002739:3:H02 


CH04MAL 


1032 


332873 


RTA00002666F.h.13.1.P.Seq




M000325^C:B0l 


CH08LNH


1033 


400781 


RTA00002685F.j.05.2.P.Seq




M00039562B:G02


CH12EDT 


1034 


456456 


RTA00002694F.b.22.1.P.Seq




M00043440A:E12


CH20COHLV 






CLUSTER 
402337' 
401974 



SEQ NAME 
RT A 00Q02636FJ.07J,p,S^~ 



I ORIENTATION! 
T 



455141 
402057 



RTA00002686FJ. 1 5. 1 .P.Seq 



CLONE ID 
M00040257D:H10 



RTA00Q02694F.b. 14-ljjeq] 



F 



M00040223A.C05 



RTA00002686F.l.l4.1,pSeq 



^402555 RTAQ0002686F.ni. 1 4. 1 ,R$ eq | 



M0004344QC.B07 
M00040260C:D04



CH13EDT 



CH20COHLV I 



CH13EDT 



^06092 r RTAQ0002685F.k.) l.LP.Seql 



M00040267C:C04



374351 RTA00002674F.i.20.1.P.Seq



M00039584C:C11



402365 RTA00002686F.j.03.1.P.Seq



M00039147A:F10



401828 



447669 
402588 



RTA0Q002686F.j. 14,), P.Seq 
RTA00002689F.a. I i.lP.Seqj 



M00040230A:H02



M00040232D:B07 



CH13EDT



CH12EDT



CH09LNL 



CH13EDT



CH13EDT 



244858 



RTA000O2686F.k. 181 ,P.Seq | 



M000425;iB:E06 



402339^ 

401766 

402952 



RTAQ0002686F.K02.LP.Seq 
RTA00002686F.i.20.1.P.Seq



M00040254B:C10



M00040256A:A06 



M00040226A:H10 



CH15CON



CH13EDT 
CH13EDT 



CH13EDT 



RTA00002686F.Q. 1 6. I.RSeql 



M00040282A:A03 



CH13EDT
CH13EDT



RTAQQ002686F.g. 14, I.P.Seq 



449669 RTA00002690F.c.10.3.P.Seq



M00040181D:H10



400520 rRTA00002685F.g.ft4-2.P,$<q1 



M00042767B:G10



CH16COP



403868 RTA00002687F.k.05.1.P.Seq



M00039512C:D06



CH12EDT 



403242 rRTAQ0002687FJ.Q5. 1 .P.Seq 



M00040313C:H11



CH14EDT 



402182 
449269 



RTA0OOQ2686F,f. 1 6. 1 .P.Seq 



M00040323B:C12



CH14EDT 



M00040174C:E10



CH13EDT
CH16COP



RTA0000:690F.c. 12.3-P Seq ] 



401290 RTA00002685F.n.10.1.P.Seq
448420 I RTA00002690F.d T Q7.j.P > Seqi 



M00042770B:B12



M00039606B:D08 



CH12EDT 



374351 [ RTA00002674Fx20.1P.Seq 



M00042790C:C07 



CH16COP 



443464 RTA00002690Fx s Q8.3.P.Seq1 



M00039147A:F10 



CH09LNL 



401079 RTA00002635F,p.05.2.p!Seq1 



M00042765C:D04



CH16COP



403916 RTA00002687F.j.UJ P.Seq 



M00039643C:B04 



CH12EDT 



1062 


401374 


RTA00002685F.p.07.2.P.Seq


F


M00039645C:E01


CH12EDT 


1063 


400503


RTA00002685F.k.02.1.P.Seq


F


I M00039570B:O10 


CH12EDT


1064 


219825 


RTA00002664F.h.06.2.P.Seq


F


M00027396D:G08


CH04MAL 


1065 


377732 


RTA00002681F.p.09.2.P.Seq


F


M00039910C:G10


CH09LNL


1066 


380348 


RTA00002684F.d.12.1.P.Seq


F


M00040121B:C05


CH09LNL 


1067 


449549 


RTA00002690F.a.09.3.P.Seq 


F


M00042431C:F01


CH16COP


1068 


402223 


RTA00002686F.f.05.1.P.Seq


F


M00040169B:FOS 


CH13EDT 


1069 


401727 


RTA00002685F.o.23.2.P.Seq


F 


M00039642D:H09 


CH12EDT 


1070 


379878 


RTA00002682F.h.12.1.P.Seq


F


M00039984A:C02


CH09LNL 


1071
1072


378602 
448065 


RTA00C0:68!F.a.08J.P l Seq 


F 


M00039839C:E05 


CH09LNL


1073


403493


RTA00002690F.c.22.3.P.Seq
RTA00002687F.j.03.1.P.Seq


F 
F 


M00042781A:A07
M00040313D:E04 


CH16COP
CH14EDT


1074 


400517 


RTA00002687F.k.15.1.P.Seq


F 


M00040320D:F02 


CH14EDT 


1075 


456636 


RTA00002694F.e.05.1.P.Seq


F 


M00043632D:F09 


CH20COHLV 


1076 


400101


RTA00002685F.o.04.1.P.Seq


F


M00039625B:G08


CH12EDT


1077 


403578


RTA00002687F.i.01.1.P.Seq


F


M00040296D:E09


CH14EDT


1078 


402419


RTA00002686F.g.20.1.P.Seq


F


M00040184C:A11


CH13EDT


1079 


375161 


RTA00002676F.n.01.2.P.Seq


F


M00039319B:H12


CH09LNL 


1080 


401851


RTA0OC02636F.d.O"l. P.Seq 


F 


M00040143A:H05


CH13EDT 


1081 


400567


RTA00002685F.a.14.2.P.Seq


F


M00039361B:E01


CH12EDT



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P 
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l\JO / 
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4 1 J OH J 


RTAAAAA7£0<F nO^IP St*A 


F 
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vrt 1 lt.U I 


1 UoO 


44oo /4 


DTAAAAA7AOAF r A'* t P ^a 

iv i huuuu— ovur .c.u— ,_>.r. 


= 


IVIUU04« fJ V D,VJ 1 I 


L.rt lOl-Ur 


iarq 

1 UOT 


77/»s" 1 1 


dti AnAn*>A7iiir h n^i i p ^*>a 
IV 1 HU0OO*O /**r .n*u**. i .r . jcq 


F 


VftAAAlQI IAA-DA9. 
IViOOO^" 1 40 A.oOo 


rjjAOl mi 


man 


17 lAtlA 
J /4U40 


RTA AflAA7A7JP h 71 1 P C^ri 


c 
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r^UAOl MI 


1/iQi 
I \rf \ 
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IV 1 AUUUUiOVJr X. 1 O. l .r. jcq 





KslAAAilliOl \, Art! 


v-rt I yv*vJr 


1 A07 


4W43o 1 


K 1 A00OO-OO fT ,2. 1 i . 1 .r.ocq 


— — 


M AAA_ i A7 A 9 ri - 1". AO 
IVI 00O4UZ 0 o U . U 0 V 


i-ri 14 tu 1 


i noi 
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= 
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uri I3v>UIN 


IUV4 




OT A AAAA7JC07C n 1 7 1 O Cjo 

Kl A00OOJO5/r .o. 1^. I .r T ocq 


r 

L ~ 
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f 
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L 
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1 
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I 100 
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L — ; — 
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ivlUUU4_/ I . d, A I 1 
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F 

L 
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pLi-tApnHT V 

c n— ULL/tiL. v 
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R I A00OO- / 1 Or .a, 1 > , 1 .r.>eq 




\ IAAA"> t A A1 

M00O- I ooc -J. ,\\JJ 
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,i A.t i i a 
4U4 1 I v 
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— = 
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\W\ 
\ X \j 


AfilAAl 
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_ ^ 
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F 
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RTAnflfiA7tt.Q7F n 1 1 1 P Si»n 


F 

- 






117 1 
1 1 — I 




RTArtftnfPnOflF c» In P Sen 








1122


452723 


RTA00002692F.e. 1 8.2. P.Seq 


^ 


M0004303cC:E05 


CH18CON 


1123 


270014 


RTA00002685F.i.15.2.P.Seq




M00039536C:H11 


CH12EDT 


1124 


401198 


RTA00002685F.i.14.2.P.Seq




M00039536C:C10


CH12EDT


1125 


452414 


RTA00002692F.e.12.1.P.Seq




M000J3032C:A10 


CH18CON


1126 


453019 


RTA00002692F.d.lS.2.P.Seq 




M00043015A:H10


CH18CON


1127 


403642 


RTA00002687F.c.24.1.P.Seq




M00039945C:F09 


CH14EDT 


1128 


401437 


RTA00002685Fx.tS2. P.Seq 




M00039377D:E12 


CH12EDT



1129


452414 


RTA0000^69^F c I 7 1 P Sea 


— 


M0004303 7 C* A10 




1130


404122 


RTAOO0C687F n 10 1 P Sea 





M00040334DB0 7 


CM I4FDT 
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400567 


RTA0G00 7 68^F a 14 1 P Sea 




M0003936ifiE01 


CHPFDT 

will —CVJ 1 
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p 

• 
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RTAOOOO~>676F f ?? 7 p c efl 




MOOOjO^OlBri 1 

ITIUUV J 7— 7 J O . \». 1 \ 


CH00I MI 


1135


407J11S 


RTA0000' 1 686F b 7 4 1 P Sea 


F 


M00040 1 3 I D-G08 


CH13EDT 


1136


401774 


RTA0000 o 687F d 08 1 P Sea 


-p 


M00039947rri03 


CH14EDT 

V— ' lilt t-n-K* 1 


1 117 


HJJUJ 


RTA000077HF d 04 1 P Sea 





M 000*7! -i 77R- F0 1 




i iir 


4S707I 


RTA0000">69?F c 05 7 P Sen 


p 


M00l f 14?070 R- F0? 


CHI SCON 


1 1 10 


440ft!? 


RTAOO00?69! F pll 1 P Sen 

rv 1 ,~\ \J' J\J\J yj 711 -C 1 J. i.r . jtvj 


= 


M0004110! AR08 

ITIUwvtJ J 7 j r\ .QUO 


CH17COHLV 

X. Ill' X» \Jl 1 lu w 


I 140 

I l*T\I 


170004 


RTAOOOO^fllF n 09 "> P Sea 


— P 




MO0O400O1 RT07 


CH09LNL 


1 111 1 


4S^7 1 1 


RTA00007A04F h 07 1 P Sea 


— 


N40OO4 14 1 0 R • f""0? 


ru7QroHLV 


1 Id? 

1 14 X 


1700? 1 


RTAOO007AA1F r\ \ 1 7 p C^n 
ix 1 inUV/UU— oo Jr.n. i j -_.r .oc(^ 




M000400Q 1 D' DO! 


CHO0I Nl 

V. nv7LJiL 


1 Id! 
1 |4j 


17/»?70 
j /Ox /V 


RTAOO00^6X0F H 1 0 "* P S*»n 


T 


mooo i07ssn-nos 


CH00I mi 


1 144 


l7d77*5 
J /4j / J 


DTAOO007651 1 F n ? 1 1 P S*»n 


p 


M0001Q001 A -M07 


fHOOl VII 
Vw nu7Li>L 


1 14J 


v/ooo 


RTAOOOO^AJIAF f4 10 1 P St*n 


p 




MO0040 i4>n*noi 


r ui IFHT 
n i jcu i 


1 140 


4UU4U/ 






X/fOOOIQ 1 R4 A - HO! 




114/ 


4UZVU4. 


RTAOOOO^/i^AF n 1 ■> 1 P S**n 


= 


MOOOdO^ 74 A - H 1 1 


CHI 1FDT 
v- n i j lu i 


1 1x19 
I 140 


4UJV 1 x 


RTA0O007AS7F i 10 1 P S^rt 
K 1 AUUUU-O0 / r J. 1 7. t .r . jcC| 


™" r 


MOOOJ01 1 7 A WO! 


CM I4FHT 


1 t JO 
1 14V 


4UU3 I I 


OTA Artfin">/;JKF K ?1 ^ P <s^n 


■ 




runpnT 

Lnl tU I 


1 t jU 


4U.£ /4o 


RTAOOOO^ARAF i Id I P 


= — 


\i10nrtl07JfiR- F 1 0 


ru I -jpnT 
v^n t jll/ i 


1 1 f 1 


.ia^q to 


DTAOO0n n AJl7F n 00 "* P St»n 


F " 

. 


mooo am ' i nr.ns 




1 1 <? 


4UI4/ 1 


DTAnOOO^AR^F n 10 1 PQpn 


— . — 


MOflO^0£7QR*F0 ! 


rHPFfYT 


1 1 j j 


4U4 jQ^ 


PT A 0000? A 97 F rtfl/i^p Q^n 





MOOOdO^ - * R- D 1 7 


r u | JFDT 


I I j4 


j /jo41 


dt a nnnn^/i 77 F i oo P Q^n 
K 1 AUUUU-0 / /r.l.uv.^.r.ocq 


. 


MOOO^OdO! A -n 1 ^ 


CH001 Ml 

Vw nv7Lii Li 


1 


401 On? 


RTA0000^686F i 10 1 P S«»n 




Mooodn^i l R-rns 


CH13EDT 


1 l JO 


4UU00J 


3T AOOOO^^S^F mflO 1 P S^a 


= 


M0OOtQ<07nF04 


CH 12EDT 


1 t ^7 




RTA000076S6F n 0^ 1 P Sea 





M00040^7 IRFP 


CH13EDT 


1 1 JO 


IftOdA? 


RT AOO00^670F n 0 1 ? P Sea 
iv i .Auuuy-O /ur.u.u i ._.r . ^cuj 





M 000 1 1 ^ 70 R ■ F 06 


CH09LNL 




40007S 
4UVU/0 


RTA0000^61x5F m IP P Sen 


p — " 


MOOO19600A* A 1 I 
|V|vW J7UVU rx p .t> i i 


CH12EDT 


i 160 

1 1 \J\J 


17174S 
j f j /40 


RTAO000" , 67 1 F 1 06 3 P Sea 


p 




CH09LNL 


1161 


40 1 10? 

4U 1 J7- 


RTA0OOO?6S5F f 08 "* P Sea 


= 


MOOOjQ^O^CEOj 


CH12EDT 


1 1 6? 


?0>d* 


RTA0000 n 7l0F h 11 1 P Sea 


r 


M000" 17 " , >4* r A E07 


CH03MAH 


1 16! 


176^70 


RTAOOOO^SSOF d 10 1 P Sea 


— — p 


M0003978->DG05 


CH09LNL 


1 1 fid 


1744?* 


RTA0000* ? 67?F a ^0 i P Sen 


— 


M000186HR GO 7 


CH09LNL 


1 IAS 




RTA0000 n 67 7 F a ^0 ~* P Sea 


F 


M 000386" 'B GO 7 


CH09LNL 


1 166 


177014 


RTA0O0O n 679F i 7 I 1 P Sea 


— 


M00039696A'EO^ 


CH09LNL 




!7X!?0 


RT A0000 7 68 1 F 1 14 ^ P Sea 


— 


M00039894CH07 


CH09LNL 


1 16R 


?!^4?'> 


RTA0000" , 665F h 19 1 P Sea 


P 


M000' 7 876SC , D05 


CH08LNH 


1 169 


402473 


RTA00002686F.p.11.1.P.Seq


p 


M000402SX:B09 


CH13EDT


1170 


374828 


RTA00002674F.m.10.1.P.Seq




M00039170A:B10


CH09LNL


1171 


403912 


RTA000O2687F.j.l9.:.P.Seq 




MOOO403l"A:H03 


CH14EDT 


1172 


401471 


RTA00002685F.o.10.2.P.Seq




MO0O3962«B:FOI 


CH12EDT 


1173 


404362 


RTA00002687F.o.06.1.P.Seq




M00040342B:D12


CH14EDT 


1174 


403849 


RTA00002687F.n.09.1.P.Seq




M00040333D:G05


CH14EDT


1175 


395617 


RTA00002687F.b.15.1.P.Seq




M00039767B:A04


CH14EDT



1176


401709


RTA00002685F.o.01.2.P.Seq




M00039624A:H09 


CH12EDT 


1177 


404464 


RTA00002687F.o.22.1.P.Seq


p 


M00040347O:F09 


CH14EDT 


1178 


447795 


RTA00002689F.e.06.3. P.Seq 


p 


M00042395C:G01


CH15CON


1179 


18139 


RTA00002708F.f.10.1.P.Seq


p 


M00004139B:B10 


CH01COH 


1180 


403898 


RTA00002687F.a.05.1.P.Seq


p 


M00039746C:H06


CH14EDT 


1181 


453512 


RTA00002693F.a.21.2.P.Seq




M00043078D:D04 


CH19COP 


1182 


404172 


RTA00002687F.d.17.1.P.Seq


p 


M00039951B:B12 


CH14EDT 


1183 


400973 


RTA00002685F.c.06.2.P.Seq


F


M00039374C:H12


CH12EDT 


1184 


450198 


RTA00002691F.e.23.2.P.Seq


P 


M00043405A:D11


CH17COHLV 


1185 


451502 


RTA00002691F.f.03.2.P.Seq


■ p 


M00043406B:G12


CH17COHLV 


1186


454414


RTA00002693F.f.18.2.P.Seq


p 


M00043220B:C04 


CH19COP 


1187 


453752 


RTA00002693F.b.02.2.P.Seq 


F 


M00043081D:F05 


CH19COP 


1188 


403700 


RTA00002687F.g.03.1.P.Seq




M00040207B:D08 


CH14EDT 


1189 


403371 


RTA00002687F.h.19.1.P.Seq


p 


M00040294D:D12


CH14EDT 


1190 


14583 


RTA00002687F.f.08.1.P.Seq




M00040203B:A05 


CH14EDT 


1191 


404161 


RTA00002687F.e.20.1.P.Seq


p 


M00039958C:B09 


CH14EDT 


1192


403274


RTA00002687F.b.10.1.P.Seq




M00039766A:G07 


CH14EDT 


1193


373465


RTA00002671F.o.09.1.P.Seq


p: 


M000386l^A H12 


CH09LNL 


1194


402582 


RTA00002686F.m.08.1.P.Seq




M00040^6*>DC08 


CH13EDT 


1195 


402241


RTA00002686F.l.16.1.P.Seq


P 



M00040261C:F01 


CH13EDT 


1196


380451


RTA00002670F.p.12.1.P.Seq




M00033581D:D08


CH09LNL 


1197


455938


RTA00002694F.d.24.1.P.Seq




M00043528C AO' 1 


CH20COHLV 


1198


374297 


RTA00002672F.i.02.2.P.Seq 




M00039013D:F02


CH09LNL 


1199 


402624 


RTA00002686F.p.13.1.P.Seq




M000402S7D:D07 


CH13EDT 


1200 


402322 


RTA00002686F.j.16.1.P.Seq


F 


M00040233A:H02 


CH13EDT 


1201 


449504 


RTA0O0O2690Fx.IL2.PSeq 


F 


M00042769C:E09 


CH16COP


1202 


226704 


RTA00002664F.X 1 1 . LP.Seq 


" "p 


M00023352D:H03 


CH04MAL 


1203 


271092 


RTA00002690F.b.23.2.P.Seq





M00042756D:A10


CH16COP 


1204 


400864 


RTA00002685F.g.17.2.P.Seq


F


M00039517B:G12


CH12EDT 


1205 


235855 


RTA00002667F.o.06.1.P.Seq


p


M00032876C:D06 


CH08LNH 


1206 


402789 


RTA00002686F.g.16.1.P.Seq


P


M00040183A:F07


CH13EDT 


1207 


19S26 


RTA00002710F.k.05.1.P.Seq


F


M00022467C:B12


CH03MAH 


1208 


380157 


RTA00002682F.h.19.1.P.Seq




M00039984D:G12 


CH09LNL 


1209 


401187 


RTA00002685F.e.15.2.P.Seq


F 


M00039500C:C04 


CH12EDT 


1210 


427346 


RTA00002665F.b.01.3.P.Seq


F


M00028066C:D07


CH08LNH 


1211


402866


RTA00002686F.c.15.1.P.Seq


p


M00040138B:H03


CH13EDT 


1212 


376712 


RTA00002677F.c.13.2.P.Seq


p


M00039343B:F12


CH09LNL 


1213 


401655 


RTA00002685F.c.22.1.P.Seq


p


M00039378D:H07


CH12EDT 


1214 


400147 


RTA00002685F.g.10.1.P.Seq


p 


M00039515A:A06 


CH12EDT 


1215 


400864 


RTA00002685F.g.17.1.P.Seq


p


M00039517B:G12


CH12EDT 


1216 


451600 


RTA00002691F.b.19.3.P.Seq


p 


M0004332SD:H02 


CH17COHLV 


1217 


400147 


RTA00002685F.g.10.2.P.Seq




M00039515A:A06 


CH12EDT 


1218


401655


RTA00002685F.c.22.2.P.Seq




M00039378D:H07 


CH12EDT 


1219 


449307 


RTA00002690F.a.10.3.P.Seq




M00042431D:C10


CH16COP 


1220 


403121 


RTA00002688F.a.01.2.P.Seq




M00040366A:B01 


CH14EDT 


1221 


451718 


RTA00002692F.e.24.1.P.Seq




M00043044B:A12


CH18CON


1222 


294345 


RTA00002685F.g.14.1.P.Seq




M00039515D:C11


CH12EDT




1223 


186541 


RTA00002712F.p.23.2.P.Seq


F


M00027181D:A05


CH04MAL 


1224 


403898 


RTA00002687F.a.05.2.P.Seq 


F 


M00039746C:H06


CH14EDT 


1225 


403541 


RTA00002687F.p.20.1.P.Seq


F 


M00040364A:E05 


CH14EDT 


1226 


450773 


RTA0000269 1 F.d,24.j.P/Seq 


F 


M00043383D:A02 


CH17COHLV 


1227 


376236 


RTA00002685F.l.24.2.P.Seq


F


M00039595C:E05


CH12EDT


1228


422357 


RTA00002688F.C.2 U .P.Seq 


F 


M00O403S5C;DO2 


CH14EDT 


1229 


404532 


RTA00002687F.p.10.2.P.Seq


F


M00040351B:F02


CH14EDT


1230 


403693 


RTA00002687F.j.23.1.P.Seq


F 


M00040317D:F02 


CH14EDT 


1231 


403693 


RTA00002687F.j.23.2.P.Seq


F 


M00040317D:F02 


CH14EDT 


1232 


401515 


RTAOO0O2685F.a02.2P.Seq 


F 


M00039624B:F12 


CH12EDT 


1233 


404532 


RTA00002687F.p.10.1.P.Seq


F 


M00040351B:F02 


CH14EDT 


1234 


452077 


RTA00002692F.d.01.2.P.Seq


F


M00043002A:E05


CH18CON


1235 


18003 


RTA00002711F.b.04.1.P.Seq


F


M00022821C:C09


CH03MAH 


1236 


377014 


RTA00002682F.f.13.1.P.Seq


F


M00039973D:C08


CH09LNL


1237


404232


RTA00002687F.n.12.2.P.Seq


F


M00040334D:C07


CH14EDT


1238 


404232 


RTA00002687F.n.12.1.P.Seq


F


M00040334D:C07


CH14EDT


1239 


406263 


RTA00002685F.d.14.1.P.Seq


F


M00039493A:C04


CH12EDT


1240 


452077 


RTA00002692F.c.24.2.P.Seq


F


M00043002A:E05


CH18CON


1241 


454349 


RTA00002693F.c.09.2.P.Seq


F


M00043133B:C11


CH19COP 


1242 


447671 


RTA00002689F.e.12.1.P.Seq


F 


M00042904B:E07 


CH15CON 


1243 


447603 


RTA00002693F.b.14.2.P.Seq


F 


M00043095A:F09 


CH19COP 


1244 


456764 


RTA00002694F.c.14.1.P.Seq


F


M00043465B:H02


CH20COHLV 


1245 


401827 


RTA00002686F.l.19.1.P.Seq


F


M00040262B:B06


CH13EDT 


1246 


404520 


RTA00002687F.f.05.1.P.Seq


F 


M00040202A:F05


CH14EDT 


1247 


449798 


RTA00002691F.d.02.3.P.Seq


F 


M00043366A:A02 


CH17COHLV 


1248 


450993 


RTA00002691F.c.12.3.P.Seq


F


M00043350D:B11


CH17COHLV


1249 


377471 


RTA0000269 1 F.c.02 JP.Seq 


F 


M00043339A:F11


CH17COHLV 


1250 


400404 


RTA00002686F.a.17.1.P.Seq


F 


MOO03975ZBGO8 


CH13EDT


1251 


19106 


RTA00002691F.e.08.2.P.Seq


F 


M0004333$C:E03 


CH17COHLV 


1252 


404024 


RTA00002687F.e.18.1.P.Seq


F


M00039958A:A08


CH14EDT 


1253 


446404 


RTA00002689F.b.14.1.P.Seq


F


M00042566C:C05


CH15CON


1254 


392921 


RTA00002677F.k.12.2.P.Seq


F


M00039411C:E07


CH09LNL 


1255 


376850 


RTA00002678F.eJ02.RSeq 


F 


M0003945SB;HII 


CH09LNL


1256 


453011 


RTA00002692F.f.10.2.P.Seq


F


M00043066B:H11


CH18CON


1257 


234811


RTA00002691F.a.03.3.P.Seq


F 


M00042352D:C01 


CH17COHLV 


1258 


402708 


RTA00002686F.m.11.1.P.Seq


F 


M0004026~A:E06 


CH13EDT 


1259 


451013 


RTA00002691F.f.08.2.P.Seq


F


M0004340QB:B03


CH17COHLV


1260 


453011 


RTA00002692F.f.10.1.P.Seq


F


M00043066B:H11


CH18CON


1261 


380462 


RTA00002670F.n.24.2.P.Seq 


F 


M000335~OB:E06 


CH09LNL 


1262 


379602 


RTA00002681F.c.21.2.P.Seq


F 


M00030855C.F01 


CH09LNL 


1263 


403896 


RTA00002687F.a.04.1.P.Seq


F


M00039746C:H05


CH14EDT 


1264 


403397 


RTA00002687F.h.02.1.P.Seq


F


M00040219B:D02


CH14EDT


1265 


271723 


RTA00002686F.b.05.1.P.Seq


F 


M000^:55A:308 


CH13EDT 


1266 


451379 


RTA00002691F.b.12.2.P.Seq


F


M00043312C:E08


CH17COHLV 


1267 


456624 


RTA00002694F.e.02.1.P.Seq


F


M00043616B:F02


CH20COHLV 


1268


375483


RTA00002686F.n.14.1.P.Seq


F 


M00040r4A:D07 


CH13EDT 


1269 


402229 


RTA00002686F.i.09.1.P.Seq


F 


M00040::iA:GU 


CH13EDT 



1270 


377039 


RTA00002686F.o.12.1.P.Seq


p 


M00040:80C:H05 


CH13EDT


1271 


18041 


RTA00002710F.h.21.1.P.Seq


F


M00022262D:G03


CH03MAH 


1272 


401381 


RTA00002685F.o.08.1.P.Seq


p 


M00039626D:F04 


CH12EDT 


1273 


428491 


RTA00002666F.c.05.1.P.Seq


p


M00032535D:H01


CH08LNH 


1274 


54656 


RTA00002661F.i.22.2.P.Seq


p 


M000043~2B;F07 


CH01COH


1275 


379183 


RTA00002679F.U7.l.P.Seq 


p 


M00039638C:G06


CH09LNL 


1276 


25594 


RTA00002711F.f.07.1.P.Seq


p 


M00022968B:E02 


CH03MAH 


1277 


403355 


RTA00002687F.d.11.1.P.Seq


p


M00039948D:D11


CH14EDT 


1278 


16789 


RTA00002709F.b.09.2.P.Seq 


p 


M00005382B:F08


CH02COH 


1279 


23292 


RTA00002708F.c.02.1.P.Seq


p


M00003750D:E06


CH01COH


1280 


373982 


RTA00002673F.b.24.2.P.Seq 


p 


M00039058A:A04 


CH09LNL 


1281 


373982 


RTA00002673F.c.01.2.P.Seq


p


M00039058A:A04


CH09LNL


1282 


449911 


RTA00002691F.e.02.2.P.Seq


p


M0004338^B:B02


CH17COHLV


1283 


450633 


RTA00002691F.f.02.2.P.Seq


p


M00043405C:G12


CH17COHLV


1284 


23939 


RTA00002713F.j.14.1.P.Seq


p 


M00027486A:F06 


CH04MAL 


1285 


450633 


RTA00002691F.f.02.1.P.Seq


p


M00043405C:G12


CH17COHLV


1286 


379122 


RTA00002672F.n.14.1.P.Seq


p 


M00039033B:F09 


CH09LNL 


1287 


449429 


RTA00002690F.a.16.3.P.Seq


p 


M000424:-A:D04 


CH16COP


1288 


430578 


RTA00002668F.g.18.1.P.Seq


p 


M000329S-iC:G05 


CH08LNH 


1289 


425824 


RTA00002687F.b.17.1.P.Seq


p 


M0003976~C:EI2 


CH14EDT


1290 


425824 


RTA00002687F.b.17.2.P.Seq


p 


M0003976*C:E12 


CH14EDT 


1291 


401266 


RTA00O02685F.U 1.2.P.Seq 


p 


M00039535D:D10 


CH12EDT


1292 


377949 


RTA00002674F.p.04.1.P.Seq




M00039200A:C10


CH09LNL 


1293 


12926 


RTAOOO0~>7 1 0F.e.2 1 . 1 .P.Seq 




M00022005C:C06


CH03MAH 


1294 


378242 


RTA00002679F.c.20.2.P.Seq




M00039664D:G07


CH09LNL 


1295 


401781 


RTA00002686F.e.08.1.P.Seq




M00040160B:A10


CH13EDT 


1296 


453101 


RTA00002693F.c.16.2.P.Seq




M00043U3a:A10 


CH19COP


1297 


377592 


RTA00002677F.l.12.2.P.Seq




M00039415D:E01


CH09LNL 


1298 


404340 


RTA00002687F.b.05.1.P.Seq


F


M0003976-CD07 


CH14EDT


1299 


400968 


RTA00002685F.h.01.2.P.Seq




M00039521D:H03


CH12EDT 


1300 


400968 


RTA00002685F.g.24.2.P.Seq 




M00039521D:H03


CH12EDT 


1301 


374417 


RTA00002671F.j.15.3.P.Seq




M00038315C:G11


CH09LNL 


1302 


374621 


RTA00002675F.p.02.1.P.Seq




M00039263D:A12


CH09LNL 


1303 


19063 


RTA00002708F.L 14.1. P.Seq 




M0000436!A:H02 


CH01COH


1304 


135941 


RTA000027 1 3F.2.06. ( ,P,Seq 




M000273^BG05 


CH04MAL 


1305 


403355 


RTA00002687F.d.11.2.P.Seq




M00039948D:D11


CH14EDT 


1306 


375226 


RTA00002677F.m.08.2.P.Seq




M000394rC:AOl 


CH09LNL 


1307 


222658 


RTA00002664F.e.14.2.P.Seq




M00027I0;3:A09 


CH04MAL 


1308 


447978 


RTA000026QOF.d.l I.J.P.Seq 




M00042800A:A03 


CH16COP


1309 


431346 


RTA00002669F.g.24.1.P.Seq




M000332ISA:C04 


CH08LNH


1310 


455579 


RTA00002694F.a.10.1.P.Seq




M000425^cB:F06 


CH20COHLV 


1311


13406 


RTA00002709F. 1. 14.1. P.Seq 




M0000712-D-HIO 


CH02COH 


1312 


378364 


RTA00002674F.o.17.1.P.Seq




MOOOJ^I^cD A07 


CH09LNL 


1313 


373788 


RTA00002671F.c.16.2.P.Seq




MO003S2?^A:GO8 


CH09LNL 


1314 


403548 


RTA00002688F.a.10.2.P.Seq




M000403o5D:E09 


CH14EDT


1315 


22425 


RTA00002709F.c.08.2.P.Seq




M00OO54*SA:H0'6 


CH02COH 


1316 


452238 


RTA00002692F.c.21.2.P.Seq




M00042OOSA.G04 


CH18CON



1317 




RTA000tP689F c 04 1 P Sen 


F 


1^0004^69" DF04 




1318 


142922 


RTA000C7PF ■*> 02.1.P Sea 


F 


M000 7 686OB*C05 




1319 


tJU 1 TO 


RT A 0000^69 1 F c 19 3 P Sea 


F


M00043j < 9BDl0 


CHI7fAHf V 


1320 


76A17 


RTA OOOfP 709 F d 04 1 P Sea 


F 


M00005601 DD08 






IRAI^S 

JOUJJJ 


RTAnO0O^670F o 06 1 P Sea 


F 


M000335*OCC!0 


CH09LNL




7017 


RTA 0000^71 OF n ^ 1 P Sea 



F 


M000^^66T^■B0' , 


CH03MAH 


I -57* 


17RQS7 


PTAAAA076R1F h I 1 1 P Sen 


p 




CH09LNL 


1 ^74 


4A4JR7 


&TAnnOfT*fi87F c Is ^ P Sea 


p 


M0003994 ^RFIO 



CH14EDT


i j— j 


4Rd5T> 


RTAnftOO*>71 7P n 06 I P^n 


r 




CH04MAL 


1 17* 

1 J — u 


17"»7A^ 


RTAnnnO' 1 fi7'>F a 13 1 P Sen 


F 


M0fi0390*"*r-F07 


CH09LNL 


1 ,/x / 


J / J /UO 




p 


M00O39O* 5 T F07 


CH09LNL


1 178 


7 1 1*7 
Z I 10/ 


RTAnnfifi770QF r Hi 1 P St*n 


F 


MAnnos4-iQ r ■ no i 


CH07COH 


i ^70 


1 JZU J 


RTAnnfifi*>7lflF -i 7 1 | pCd fl 


F 
r 


N/inAnA707" 1 RH 1 7 
i»iv/uu v 17 * - o nu 


CH03MAH 

v» i iuj ivi/vn 


! i^A 




uTAnnfirt"^7AQP r m 7 p c^n 


c 
r 


MAAAASJ-lQR-nA \ 




i j j i 


4U I U I J 


RTAOnnn7AHSF n 16 "> P S^n 
IV 1 nuvvU^uojr . v. i u.^t.jci^ 


F 
r 


MAOfHOfvI ! a A 05 


CHPEDT 

V. Ill — L^lrS I 


1 *517 
1 JJi 


4U444V 


DTinnnn > 'A»7F r 7 P s^rt 

K I AUUUU-Oo / r .C.l/4.-i. r.oCq 


F 
r 


lt*iAAA";07™Ar'- FA4 


v. n itcu i 


1 J J J 


4Zvo /x 


PTAfinftf>"*A/;9F h ID 1 P Si»n 
lv 1 AwUU-OOOr .0. 1 U, 1 .r.ocq 


F 

r 


lV| \J\J\J J — 7 v . A . DUO 




1 11. A 


48341 


K I AUUUU-^ / 1 — r . i.u / . ! .r .^cq 


F 

r 




THA4.VIAI 
no**ivii-\i^ 




j 7 8424 


K 1 AUUvU_0o 1 r .a.Uj»,Z.r.j>eq 


p 

r 


2V<1AAAt0ft^0R-Rn 1 




1 JAO 


49340 


OT A /WW*** 1 *7P /I ">A 1 P Q^o 


r 


IVifAAOT'i^OQP F IA 




\"l~*'7 
1 


^79170 


K 1 A UUvU J.0 / — r .1.- 1 . 1 .r.ocq 


c 

r 


MAAA^OA i An nAA 


runol NIL 


UJO 


i*7Q< in 
1 /V34U 




F 

r 


MAAA4A I OAP FAS 


v. 1 Iu7LLiL 


i i ci 
I JjV 




pta nnfift^£Q l k fmi p 


F 
r 


MAAA4^JI iR-nni^ 


TH 1 7rOHI V 

V,*l 1 ' V— W l\ (L» V 


1 J4U 


44Voji 


pt \ nnnn^/^o it a r. i d c^ rt 
K l AUUUUwOV 1 r I .r.ocq 


c 

r 


^fAAAJ'i^Ot A RAR 


cu prom v 


1 1A 1 

1 ->4 1 


3olM IV 


PTAnnnn7A7fiF m P s*»n 

IV 1 nUUUU^O / U r .III i.r .h?cq 


F 
r 


\*f AAA "5 ^ ^riOn'nAT 


CH09LNL


1 147 


1 3JU74 


PTAAfinn">71dF n 1"> 1 P Spn 

i\ l rtuuuu-. / 1 **r .a. i -. i .r.jcq 


F 


WAAA7774" A *r01 


CH04MAL 


1 ->4J 


440/47 


PTAnnnm^onF h u"> p s^n 
iv i Auuuu-owr.u. i-*.«,r.3cq 


F 
r 


\#f AAAd^Sri^r- FA7 
ivioouh — ov'jv. .rui 


CH16COP 




44o /4V 


IV I aUUuU-.07ur,u. i*t.j.r.ocq 


F 


M f)0A4"* S C t C ■ F0 7 


CH16COP 


1 1A< 


4348 10 


fv 1 AUUUU^oyjr.Q. lO. 1 .r.ocq 


F 
r 


N#IAAA4^A0ri A ri04 


CH19COP




3/4/44 


PTAAAAA">^7AF i \f\ "> P S«*n 


F 




CH09LNL 


I j*f / 


4U444V 


RTAAAAn^ftJlVF r 04 1 P S*»n 


F 

r 


M AAA > 97^ 0C * E04 


CH14EDT 


t 14<1 




RTAAnnn" , 66 1 F h 14 1 P Sen 


F 
r 


M00004 , * > " " C * E03 


CH01COH


1 14Q 


43 


PTA AAAA"»A0 1 F h 1 7 ^ P Sen 
IV t AUvVv—O" i r -O- I^-J.r,v>cq 


F 
r 


MAAA4^ 1 E08 


CH17COHLV


1 ISA 


430J-1J- 


RTAAAAA^^QdF ci ~> \ 1 P Sen 


F 


M00043->"^6BD10 


CH20COHLV 


1351 


HJJ7J / 


RT40000" > 694F c 15 1 P Sea 


p 


M0004346 "C A03 


CH20COHLV 


I 1^7 


4*loUOJ 


RTAAO0O -, 66^F 1 05 1 P Sen 


F 


^10003^6" SCG08 


CH08LNH 




174777 


RTA0000^676F i !9 3 P Sea 


p 


M000393 10AC07 


CH09LNL 


! 

i jji 


4„04U / 


RTA0000^66SF n P 1 P Sea 



p 


M0003" , 5 {ODFP 


CH08LNH 


1355 


17&000 


RTA0000^68I F i 16 1 P Sea 


p 


M0003988^D:C04 


CH09LNL 


1 JJU 


J-\T7 1 7 


RTA0000^697F b 17 7 p Sea 


p 


M0004 H, 966CE06 



CH18CON


1357 


378000 


RTA00002681F.j.16.2.P.Seq


F 


M0003988"D:C04 


CH09LNL 


1358 


448356 


RTA00002690F.c.03.3.P.Seq


F


M00042760A:C12


CH16COP


1359 


456629 


RTA00002694F.d.04.1.P.Seq


F


M00043491C:F04


CH20COHLV


1360 


431346 


RTA00002669F.g.24.2.P.Seq 


F 


M0003321SA:C04 


CH08LNH


1361 


377206 


RTA00002682F.m.14.1.P.Seq


F


M00040015C:F08


CH09LNL


1362 


453036 


RTA00002692F.b.11.2.P.Seq


F


M00042960D:H08


CH18CON


1363 


402632 


RTA00002686F.g.15.1.P.Seq


F 


M000401S:D:D06 


CH13EDT



1364 


230532 


RTA00002664F.c.11.2.P.Seq


F 


M00026901A:G07 


CH04MAL 


1365 


30755 


RTA00002663F.e.03.1.P.Seq


F


M00022133A:E05


CH03MAH 


1366


451438


RTA00002691F.d.23.3.P.Seq


F


M00043333C:F12


CH17COHLV


1367 


379011


RTA00002681F.n.23.1.P.Seq


F


M00039903C:D01


CH09LNL 


1368 


404048 


RTA00002687F.g.01.1.P.Seq


F


M00040206A:A07


CH14EDT 


1369 


404048 


RTA00002687F.g.0 1 IP.Seq 


F 


M00040206A:A07 


CH14EDT 


1370 


452398 


RTA00002692F.f.17.2.P.Seq


F


M00043125C:A11


CH18CON


1371 


403686 


RTA00002687F.d.03.1.P.Seq


F


M00039946B:F08


CH14EDT 


1372


403686 


RTA00002687F.d03.lP.5iiq 


F 


M00039946B:F08 


CH14EDT 


1373 


404048 


RTA00002687F.f.24.2.P.Seq 


F 


M00040206A:A07


CH14EDT 


1374 


404048 


RTA00002687F.f.24.1.P.Seq


F


M00040206A:A07


CH14EDT 


1375 


450627 


RTA 0000269 1 F.f.01 .iP.Seq 


F 


M00043405C:G02 


CH17COHLV


1376 


375589 


RTA00002680F.f.06.2.P.Seq 


F 


M00039794A:E04 


CH09LNL 


1377 


379011 


RTA00002681F.n.23.2.P.Seq


F


M00039903C:D01


CH09LNL 


1378 


16789 


RTA00002709F.b.09.1.P.Seq


F 


M00005382B:F08 


CH02COH 


1379 


427346 


RTA00002665F.a.24.3.P.Seq 


F 


M00028066C:D07 


CH08LNH 


1380 


49540 


RTA00002712F.e.01.1.P.Seq


F


M00023399C:E10


CH04MAL 


1381 


14440 


RTA00002674F.e.14.2.P.Seq


F


M00039129C:D04


CH09LNL 


1382 


391401 


RTA00002682F.k.11.1.P.Seq


F


M00040004D:B03


CH09LNL 


1383 


43782 


RTA00002662F.d.21.2.P.Seq


F


M00007165B:G11


CH02COH 


1384 


212635 


RTA00002666F.p.01.1.P.Seq


F


M00032688D:D11


CH08LNH 


1385 


15618 


RTA00002710F.o.05.1.P.Seq


F 


M000226S-A;C02 


CH03MAH 


1386 


18501


RTA00002669F.g.23.3.P.Seq


F 


M000332!"B:H07 


CH08LNH 


1387 


400310 


RTA00002688F.b.05.2.P.Seq


F 


M00040375C:B06 


CH14EDT 


1388 


403796 


RTA00002687F.h.17.1.P.Seq


F 


M00040293D:G04 


CH14EDT 


1389 


452314 


RTA00002694F.a.21.1.P.Seq


F


M00043416C:A02


CH20COHLV 


1390 


119179 


RTA00002712F.k.20.1.P.Seq


F


M00027021A:G02


CH04MAL 


1391 


167451 


RTA00002663F.j.11.1.P.Seq


F


M00022646A:H10


CH03MAH 


1392 


450523 


RTA00002691F.e.19.2.P.Seq


F


M00043401D:G08


CH17COHLV


1393 


289535 


RTA00002693F.f.06.1.P.Seq


F


M00043202B:F01


CH19COP 


1394 


374736 


RTA00002673F.o.08.2.P.Seq


F 


M00039I !23:C05 


CH09LNL 


1395 


378912 


RTA00002672F.n.01.2.P.Seq


F 


M00039056CB05 


CH09LNL 


1396 


134877 


RTA00002662F.d.05.2.P.Seq


F 


M00007026B:H09 


CH02COH 


1397 


372811


RTA00002670F.c.12.2.P.Seq


F 


MOOO3334"C;F02 


CH09LNL 


1398 


373296 


RTA00002672F.e.08.2.P.Seq


F


M00038994A:A10


CH09LNL


1399 


373296 


RTA00002672F.e.08.1.P.Seq


F


M00038994A:A10


CH09LNL 


1400 


452903 


RTA00002692F.f.08.2.P.Seq


F


M00043060D:G12


CH18CON


1401 


450067 


RTA00002691F.c.17.3.P.Seq


F 


M00043352D:C03 


CH17COHLV 


1402 


451013 


RTA00002691F.f.08.1.P.Seq


F 


M0004340*B:B03 


CH17COHLV 


1403 


212635 


RTA00002666F.o.24.1.P.Seq


F


M00032688D:D11


CH08LNH


1404 


452367 


RTA00002692F.c.02.2.P.Seq


F


M00042976A:H04


CH18CON


1405 


450627 


RTA00002691F.e.24.1.P.Seq


F


M00043405C:G02


CH17COHLV 


1406 


186438 


RTA0O0O27I3F.U5.I. P.Seq 


F 


M00027462A:D07 


CH04MAL 


1407 


431066 


RTA00002669F.c.17.3.P.Seq


F 


M00033IS*D:F08 


CH08LNH


1408 


378912 


RTA00002672F.m.24.2.P.Seq


F


M00039036C:B05


CH09LNL 


1409 


15731


RTA00002709F.i.13.1.P.Seq


F


M00007116C:G02


CH02COH 


1410 


377(87 


RTA00002683F.d.21.2.P.Seq


F 


M0004004"C:F05 


CH09LNL 



1411 


376107 


RTA00002677F ; ;L08.2.P.Seq 


F 


M00039333D:D09 


CH09LNL 


1412 


450580 


RTA00002691F.c.20.3.P.Seq


F


M00043359C:G01


CH17COHLV


1413 


379942 


RTA00002679F .1.2 L LP.Seq 


F 


M00039707A:D02 


CH09LNL


1414 


375589 


RTA00002680F.f.06. 1 .P.Seq 


F 


M00039794A:E04 


CH09LNL


1415 


375789


RTA00002674F.X 1 6. 1 P.Seq 


F 


M00039120C:H03


CH09LNL 


1416 


456227 


RTA00002694F.c.16.1.P.Seq


F


M00043465C:C09


CH20COHLV 


1417 


455852 


RTA00002694F.a.02.1.P.Seq


F 


M00042592A:H10 


CH20COHLV 


1418 


25169 


RTA00002710F.m.05.1.P.Seq


F


M00022579C:C11


CH03MAH 


1419 


376524 


RTA00002678F.h.23.2.P.Seq 


F 


M00039477A:B03 


CH09LNL 


1420 


449562 


RTA00002690F.b.13.2.P.Seq


F


M00042515C:F08


CH16COP


1421 


449562


RTA00002690F.b.13.3.P.Seq


F


M00042515C:F08


CH16COP


1422 


286001 


RTA00002690F.b.08.2.P.Seq 


F 


M00042511A:H04


CH16COP


1423 


286001 


RTA00002690F.b.08.3.P.Seq


F


M00042511A:H04


CH16COP


1424 


380322 


RTA00002683F.p.21.1.P.Seq


F


M00040106B:B09


CH09LNL


1425 


401603 


RTA00002685F.f.23.2.P.Seq 


F 


M00039510C:G02


CH12EDT 


1426 


376541 


RTA00002678F.d.13.2.P.Seq


F


M00039456A:C08


CH09LNL


1427 


449123 


RTA00002690F.a.13.3.P.Seq


F


M00042435A:A11


CH16COP


1428 


41S358 


RTA00002686F.m.07.1.P.Seq


F


M00040265D:B07


CH13EDT


1429 


380263 


RTA00002689F.a.22.1.P.Seq


F


M00042543C:G04


CH15CON


1430 


455748 


RTA00002694F.b.06.1.P.Seq


F 


M0004342SD:G08 


CH20COHLV 


1431 


451679 


RTA00002693F.a.04.2.P.Seq 


F 


M00042612D:F06


CH19COP 


1432 


396332 


RTA00002686FJU 4.1. P.Seq 


F 


M00040252C:C06 


CH13EDT


1433 


377578 


RTA00002683F.b.11.2.P.Seq


F


M00040037A:E11


CH09LNL 


1434 


20061 


RTA00002710F.m.14.1.P.Seq


F


M00022597D:A06


CH03MAH 


1435 


402494 


RTA00002686F.h.16.1.P.Seq


F


M00040191A:B09


CH13EDT 


1436 


372798 


RTA00002670F.c.18.2.P.Seq


F 


M00033349D:F05 


CH09LNL 


1437 


236295 


RTA00002679F.a.19.2.P.Seq


F 


M00039655B:H09 


CH09LNL 


1438 


451570 


RTA00002691F.c.03.3.P.Seq


F


M00043340B:H03


CH17COHLV


1439 


35S47 


RTA00002708F.h.03.1.P.Seq


F


M00004239B:F11


CH01COH


1440 


455706 


RTA00002694F.b.10.1.P.Seq


F 


M00043435B:G09 


CH20COHLV 


1441 


346310 


RTA00002684F.d.18.1.P.Seq


F


M00040122D:A02


CH09LNL 


1442 


189561


RTA00002676F.j.09.3.P.Seq


F


M00039308B:G08


CH09LNL 


1443 


403200 


RTA00002687F.j.24.1.P.Seq


F 


M00040318A:B02 


CH14EDT 


1444 


401413 


RTA00002685F.i.03.2.P.Seq 


F 


M00039530B:E02 


CH12EDT


1445 


448680 


RTA00002690F.b.02.3.P.Seq


F


M00042440B:E09


CH16COP


1446 


117060 


RTA00002679F.h.24.1.P.Seq


F


M00039686C:C05


CH09LNL


1447 


403200 


RTA00002687F.j.24.2.P.Seq


F


M00040318A:B02


CH14EDT 


1448 


448589 


RTA00002690F.a.07.3.P.Seq


F


M00042349D:D07


CH16COP


1449 


373806 


RTA00002674F.o.02.1.P.Seq


F


M00039179A:G09


CH09LNL 


1450 


377055 


RTA00002682F.k.13.1.P.Seq


F


M00040005B:C11


CH09LNL


1451 


373111 


RTA00002670F.n.14.2.P.Seq


F 


M00033566C:E08 


CH09LNL 


1452 


12350 


RTA00002713F.a.05.1.P.Seq


F


M00027195C:E04


CH04MAL 


1453 


450366 


RTA00002691F.c.06.3.P.Seq 


F 


M00043344D:E04 


CH17COHLV


1454 


397851 


RTA00002680F.b.04.2.P.Seq


F


M00039775A:A09


CH09LNL


1455 


403200 
CH04MAL


1758 


376896 


RTA00002677F.i.03.2.P.Seq


F


M00039402B:E03


CH09LNL


1759 


376469 


RTA00002674F.h.06.1.P.Seq


F


M00039140D:A04


CH09LNL


1760 


455147 


RTA00002694F.c.06.1.P.Seq


p


M00043458A:B12




1761 


375381 


RTA00002683F.a.02.2.P.Seq


F


M00040031A:E06


CH09LNL 


1762 


160196 


RTA 00002 663 F. t'A l.LP.Seq 


F 


M0002' r >34C*D06 




1763 


185945 


RTA00002713F.b.21.1.P.Seq


F


M00027232D:B08


CH04MAL


1764 


446139 


RTA00002689F.b.13.3.P.Seq


P 


M0004^65CA08 


CH15CON


1765 


379182 


RTA00002682F.c.15.1.P.Seq


P 


M000399^3BGOS 


CH09LNL


1766 


376200 


RTA00002693F.f.08.2.P.Seq


F 


M00043203A:B09 


CH19COP


1767 


379506 


RTA00002681F.c.10.2.P.Seq


P 


MO003985 irnn 


CH09LNL


1768 


35715 


RTA00002708F.a.04.1.P.Seq


— p— 


MOOOOI o6A H 1 t 


CH01COH


1769 


428500 


RTA00002665F.p.06.1.P.Seq


— 


M0003" > 508B*HOi 


v. nuou^fi 


1770 


428812 


RTA00002667F.a.10.1.P.Seq


F 


MO003' v 7PB'GO'* 


CH08LNH


1771


378911


RTA00002672F.n.24.2.P.Seq




M00039042B:B02


CH09LNL


1772 


373697 


RTA00002678F.d.01.2.P.Seq




M000 *94>4B- A 1 1 


CUOQi Ml 


1773 


372886 


RTA0OOO">6"0F b 2 P Seq 




M000"343C HOK 


CHOQI MI 


1774 


378911


RTA00002672F.o.01.2.P.Seq




M00039042B:B02


CH09LNL


1775 


122451 


RTA00002663 Fa. 1 2. 1 .P.Seq 




I'lWvWOVul/D.V. 1 1 




1776 


19867 


RTA00002711F.c.13.1.P.Seq


F 


M000'>' , 856CAO7 


C HO 1 Vf A H 


1777 


37372 


RTA00002 "08 F.f.20. 1 .P.Seq 




M000O41*5D-4!O 


run irnH 


1778 


431419 


RTA00002669F.j.23.3.P.Seq


F 


M00033">6IC'DP 




1779 


186360 


RTA00002713F.a.21.1.P.Seq


F


M00027207B:F07


CH04MAL 


1780


430751


RTA00002669F.j.11.2.P.Seq


p


M00033248A:B02


CH08LNH


1781 


372572 


RTA000026^0F * 20. 1 .P.Seq 


F 


M00033410B:C09


CH09LNL 


1782 


376913 


RTA00002683F.m.04.2.P.Seq 


p 


M00040089CE06 


CH09LNL 


1783 


376990 


RTA00002683 F.f.09.2. P.Seq 


p


M00040055D:B01 


CH09LNL 


1784 


58508 


RTA00002661F.e.17.1.P.Seq


F 


M000037S6A:A 1 1 


CH01COH 


1785 


189139


RTA00002664F.b.14.2.P.Seq


p


M00026851B:F01


CH04MAL 


1786


384025


RTA00002670F.k.20.2.P.Seq


F 


M00033454A:D09 


CH09LNL 


1787 


379126 


RTA00002683F.n.05.2.P.Seq


F 


M00040092B:F05 


CH09LNL 


1788 


377633


RTA00002684F.g.15.2.P.Seq


F


M00040304B:F06


CH09LNL 


1789 


430284 


RTA00002667F.k.06.1. P.Seq 


F 


M00032831C:G07 


CH08LNH


1790 


374773 


RTA00002676F.l.22.3.P.Seq


F


M00039316A:C01


CH09LNL 


1791 


403761 


RTA00002687F.m.03.1.P.Seq




M00040327B:G06


CH14EDT


1792 


375547 


RTA00002677F.m.04.2.P.Seq




M00039417A:E12


CH09LNL 


1793 


80436 1 


RTA0000266 1 Fx,09, 1, P.Seq 




M00001582A:E02


CH01COH


1794 


189139 


RTA00002664F.b.14.1.P.Seq




M00026851B:F01


CH04MAL 


1795 


376614 


RTA00002677F.c.05.2.P.Seq




M00039341D:D07 


CH09LNL 


1796 


404513 


RTA00002688F.d.13.1.P.Seq




M00040390B:F02 


CH14EDT 



1797 


375714 


RTA000026?7F.mJ3.2.P.$eq 


F 


I M000394I7GG01 


("HOQI Mf 


1798 


51564 


RTA0O0027 1 2F.d.23 . 1 P.Seq 


F 


I M000" , 3398B DP 




1799 


399551 


RTA00002687F. f. 1 3,2. P.Seq 


F 


MOOOUO^OIfyHli 


l«.ri i -*cL/ 1 


1800 


133512 


RTA00O02693Fe.242.PSeq 


F 


M0004 3^00 A ■ HnQ 


r*w i oprio 
\- n . v^v^r 


1801 


375176 


RTAQ0002675F.p. 1 3 . LP. Seq 


F 


iVI00039 J '66 D- Hn«l 

I'lvuu j 7_yu u.nlrv 




1802 


375704 


RTA00002676F.h. l3.2.P.Seq 


F 


M00039300C G04 


fwnoi Mr 


1803 


399551 


RTA00002687F.t 13,1 P.Seq 


F 


M00O40^OiDHI I 


ru 1 ipnT 


1804 


403357 


RTA00002687Fj.05.2.P.Seq 


F 




r*u i i chT 
v.n l-*tu 1 


1805 


34513 


RTA00002709F.C.22- 1 -P.Seq 


F 


MOQOOSSS^AA 10 




t 1806 


I2IS71 


RTA000027 1 3 Fa. 09. 1 .P.Seq 


F 


itiWv« / I70D.DvO 




1807 


32095 


RTA00002662F.d. I5.2.P.Seq 


F 


M000071 ITRlfl 

IvlwwwW f f IbViiOIV 




1808 


403183 


RTA00002687F.n.02. 1 .P.Seq 


F 


M 00040 i 17 fV Rrt ^ 


Pu i icnf 


1809 


168691 


RTA0O002663F j.02, 1 .P.Seq 


F 


M000^6l5DO05 




1810 


430854 


RTA00002668F.p-3 J ,2.P.Seq 


F 


Mooo , >'ii7^n'rrti 

I'lwvwJJ 1 / JU-VV I 




1811 


377987 


RTA00002679F.h.08. 1 .P.Seq 


F 


mooo 396X7 a -rnx 


piJAOl Ml 


1812 


428408 


RTA 00002665 F d 23 1 P Sea 


F 






18)3 


375930 


RTA00002677F.h.03.2.P,Seq 


F 


M00039396D'AOd 


pwnoi Mf 


1814 


28453 


RTAOOOO^l 1 F h 07 1 P Sea 


F 


I'lWU.JU/trt.D 1 I 


f writ via u 


1815 


1 19478 


RTA 0000^686 F n 07 1 P Sea 


F 


VfOno *ifv> 7 1 p ■ nn k 

l»l Ww*tw« / 1 V_ . L/VO 


f Li : t enr 
l-rt : ;tU 1 


1316 


403 1 89 


RTA0OO0**687F « 16 ^ P Sea 


F 




cut ' icnT 


1817 


129692 


RTA0000**679F e 13 1 P Sea 


F 

r 


MOftft "i0/»7"? A'FrtO 




1818 


86663 


RTA0000 o 664F a 10 1 P Sea 


F 




run i vj v ! 


1819 


403357 


RTA0000 7 687F i 05 1 P Sea 


F 

r 


Mfloo n - r. rn 


L rl 1 4bU I 


1820 


373 198 


RTA00O0*>670F d 01 ^ P Sea 


F 


m ftnn »'? 7 » H' r.n^ 

1VIUUU J J J t oL/^UUi 


/^fJ/\Q| VJI 


1821 


373198 


RTA0000*>670F o °4 ^ P Sea 


F 






1822 


25233 


RTA0000* 1 71 1 F b 06 1 P Sea 


F 






1823 


403429 


RTA00002687F.a.07.2.P,Seq 


F 




r*w i icnT 
u n 1 4tu i 


1824 


4171 19 


RTA000C686F i 14 1 P Sea 

IV * f%W WW W— p V VU 1 . 1. It. I.I .hJl>U 


F 

r 






1825 


376066 


RTAOOOO^SOF c P 1 P Sea 


F 


l» 1 w vw J "(O 1 U-/ .U 1 v 


rwnoi MI 


1826 


403189 


RTAO0OO n 6S7F 2 16 I P Sea 


F 


M00O4ft">i7n*Rn7 

!▼! WV — Wm 1 / U.OUf 


ru 1 ICHT 

\„ n ( 4cu i 


1827 


403429 


RTACKXKP687F a 07 1 P Sea 


F 




ru i ipnT 
n 1 4tu i 


! 1828 


430975 


RTA00002669Fj,06.3.P.Seq 


F 


M0003 »^46r-F03l 




1829 


427544 


RTA00002665F.e.03. 1 .P.Seq 


F 


M000" I S354A*BP 


rwn^i MH 


I 1830 


401 155 


RTA00002685F.0. 1 2. 1 .P.Seq 


F 


M000 ^9630 A COX 


v. r. i _ i 


1831 


377005 


RTA00002682F.k. 1 5, 1 .P.Seq 


F 


M 0004 000 5 D- R07 


rWfiOl Nil 


1832 


379032 


RTA00002683F.jl07. 1 .P.Seq 


F 


M0004003^ A ■ D09 

» T iwvw^Ww JZtr\* kJ\J7 


rHOQl Ml 


1833 


400097 


RTA00002685F.g. 19.2.P.Seq 


F 


M0003952 1 A A0"> 


fHPFDT 


1834 


383401 


RTA00002670F.k. 13.2.P.Seq 


F 


M00033450C:A02 


CH09LNL


1835 


379032 


RTA00002683F.a.07.2.P.Seq 


F 


M00040032A:D09 


CH09LNL 


1836 


429663 


RTA00002667F.m.2 1 . 1 .P.Seq 


F 


M00032864B:B09


CH08LNH 


1837 


374018 


RTA00002672F.a.l4.2.P.Seq 


F 


M00038632OB09 


CH09LNL 


1838 


375409 


RTA00002678F.n.02.2.P.Seq


F 


M00039616B:C01 


CH09LNL 


1839 


401155 


RTA00002685F.o.12.2.P.Seq


F


M00039630A:C08


CH12EDT 


1840 


13958 


RTA00002711F.b.02.1. P.Seq 


F 


M00022817A:H02 


CH03MAH 


1841 


38767 


RTA00002687F.a.11.1.P.Seq


F


M00039748C:F11


CH14EDT 


1842 


29398 


RTA00002663F.c.23.1.P.Seq


F


M00022015B:B07


CH03MAH 


1843 


12453 


RTA00002709F.c.23.2.P.Seq


F


M00005556B:D02


CH02COH 


1844 


38767 


RTA000026S7Fa.IMP.Seq 


F 


M00039748C:F11 


CH14EDT


1845 


279885 


RTA0000267 1 FJ.05.2-P.Seq 


F 


M0003S279C:A1! 


CH09LNL 



1846


188592


RTA00002664F.c.18.2.P.Seq


F


M00027141C:H03


CH04MAL


1847


376469


RTA00002674F.h.06.2.P.Seq


F


M00039140D:A04


CH09LNL 



SEQ
ID


CLUSTER


SEQ NAME


ORIENTATION


CLONE ID


LIBRARY 


1


10600 


RTA00002891F.j.07.1.P.Seq


F


M0OOO375SB:DO7


CH01COH


2 


18327 


RTA00002900F.p.12.1.P.Seq


F


M00005413D:A05


CH02COH 


3 


1759 


RTA00002923F.f.23.1.P.Seq


F


M0OO39248C:A0S


CH09LNL


4 


10924 


RTA00002907F.k.12.1.P.Seq


F


M00022224A:C07


CH03MAH 


5 


45331 


RTA00002903F.l.10.1.P.Seq


F


M00007037D:D10


CH02COH 


6


42233


RTA00002912F.g.24.1.P.Seq


F 


M00027359B:A06 


CH04MAL 


7 


7211 


RTA00002909F.h.06.1.P.Seq


F


M00022634A:C07


CH03MAH 


8 


21395 


RTA00002890F.k.16.1.P.Seq


F


M00001637D:C12


CH01COH


9 


3093 


RTA00002923F.e.03.1.P.Seq


F


M00039225A:D11


CH09LNL


10 


15806 


RTA00002894F.f.07.1.P.Seq


F


M00003991A:C11


CH01COH


11


19739


RTA00002896F.d.12.1.P.Seq


F


M00004147C:E01


CH01COH


12 


1 140879 


RTA00002905F.e.17.1.P.Seq


F 


M0O0O79S5C:D0S 


CH03MAH 


13 


29706 


RTA00002908F.l.22.1.P.Seq


F


M00022437B:A08


CH03MAH 


14 


109581 


RTA000029 ISF.i.OS. LP.Seq 


F 


M0OO329OSA:DO3 


CH08LNH


15


25009


RTA00002906F.k.11.1.P.Seq


F


M00022016B:F01


CH03MAH 


16 


8328 


RTAO00O2SSSF.e.07. 1 .P.Seq 


F 


M00001451C:E10


CH01COH


17 


15045 


RTA00002887F.e.06.1.P.Seq


F 


M0OOO!393C:E0S 


CH01COH


IS 


21216 


RTA00002393F.p.22. 1 .P.Seq 


F 


M00004416B:G10


CH01COH


19 


185754 


RTA00002912F.L09. LP.Seq 


F 


M00027506B:G01


CH04MAL 


20 


11381 


RTA00002909F.h.10.1.P.Seq


F 


M0OO22b3SA:D03 


CH03MAH 


21 


185989 


RTA00002910F.h.12.1.P.Seq


F 


M00022924C:F04 


CH03MAH 


22 


9667 


RTA00002923F.a.03.1.P.Seq


F 


M00039l62D:CCU 


CH09LNL 


23 


15817 


RTA00002903F.o.03.1.P.Seq


F 


M00007103D:C02


CH02COH 


24 


10193 


RTA00002923F.j.09.1.P.Seq


F 


M00039294C-BC9 


CH09LNL


25 


6355 


RTA00002894F.p.12.1.P.Seq


F 


M000040f5D:D05 


CH01COH


26 


12227 


RTA00002909F.e.18.1.P.Seq


F 


M00022601B:G06 


CH03MAH 


27 


11047 


RTA00002893F.o.06.1.P.Seq


F 


M00003960D:C12 


CH01COH


28


1870 


RTA00002910F.m.08.1.P.Seq


F 


M00023020C:H03 


CH03MAH 


29


20065 


RTA00002908F.m.09.1.P.Seq


F 


M00022-i\MA;A0S 


CH03MAH


30 


19454 


RTA00002900F.m.23.1.P.Seq


F 


M000053'9A:D10 


CH02COH 


31 


48048 


RTA00002922F.m.13.1.P.Seq


F 


M00039124D:H01


CH09LNL 


32 


19799 


RTA00002908F.h.19.1.P.Seq


F 


MOOO22-U9D:F0S 


CH03MAH 


33 


135562 


RTA00002911F.m.07.1.P.Seq


F 


MOOO27093A;HO: 


CH04MAL 


34 


24214 


RTA00002891F.k.19.1.P.Seq


F 


M00003 _ 64D:F0^ 


CH01COH


35 


5172- 


RTA00002908F.p.22.1.P.Seq


F 


M00022525B:D09 


CH03MAH 


36 


50495 


RTA00002898F.c.16.1.P.Seq


F 


M00O0432IC:CU 


CH01COH


37 


43287 


RTA00002908F.k.16.1.P.Seq


F 


M000:24^0D:B0: 


CH03MAH 


38


15324 


RTA00002905F.p.20.1.P.Seq


F 


mooo:i^)-C:Bo- 


CH03MAH 


39 


22157 


RTA000O2SSSF.2.07. LP.Seq 


F 


MCOOOU61D:BIO 


CH01COH


40 


15249 


RTAOOC02915FXQS, LP.Seq 


F 


M0OO:>:469B:Gi: 


CH08LNH


41 


2764 


RTA00002925F.c.11.1.P.Seq


F 


M00039$:9B:E01 


CH09LNL


42 


23338 


RTA00002889F.b.14.1.P.Seq


F 


MOOOOLMSB:DiO 


CH01COH


43 


11074 


RTA00002899F.g.22.1.P.Seq


F 


MOO004cO3C;CL0 


CH01COH


44 


18367 


RTA00002922F.b.09.1.P.Seq


F 


M00O3So:9D:Ci: 


CH09LNL 


45


21703


RTAOO0O_9LVF.m,06.LP.beq 


F 


MO0O0"O59B;D0" 


CHOXOH 


46 


21470 


RTA00002895F.c.14.1.P.Seq


F 


MO0O04L\vB:D03 


CH01COH


47 


15492 


RT AOOOO29O~F.p.0b. LP.Seq 


F 


M000:::S2B:C09 


CH03MAH 


48


4022 


R7A00002$9T.i.22. LP.Seq 


F 


M00004:t>9B:BLU 


CH01COH


49 


21579 


R T AOO0O:S^ i F.e.03. 1 .P.Seq 


F 


M00001oS0B:H0l 


CH01COH


50 


186283


RTA00002913F.c.06.1.P.Seq


F 




CH04MAL








CLUSTER






CLONE ID


LIBRARY


51 


5410 


RTAO0O0292 1F...Q8. 1 P.Seq 


F 


M00033445D:G03 


CH09LNL


52 


22420 


RTA00002901F.e.19.1.P.Seq


F 


M00005474C:H09 


CH02COH 


53 


140553 


RTA00002916F.n.02.1.P.Seq


F 


M00032638B:F02


CH08LNH 


54 


23849 



RTA00002887F.a.22.1.P.Seq


F 


M00001386B:F11


CH01COH 


55 


21945 


RTA0000289DF.L22. LP.Seq 


F 


M00004103C:E10


CH01COH 


56 


7867 


RTA00002901F.p.08.1.P.Seq


F 


M00005710B:H03 


CH02COH 


57 


14533 



RTA00002896F.LOL 1 .P.Seq 


F 


M00004179C:B06


CH01COH 


58 


5790 


RTA00002919F.g.17.1.P.Seq


F 


M00033080C:A07


CH08LNH 


59 


186153 


RTA00002911F.i.24.1.P.Seq


F 


M00027017A:B09


CH04MAL 


60 


10561 


RTA00002899F.h.08.1.P.Seq


F 


M00004606D:H09 


CH01COH 


61 


24572 


RTA0O002893F.L08.1. P.Seq 


F 


M00003926A:FU 


CH01COH 


62 


13138 


RTA00002888F.m.03.1.P.Seq


F 


M00001488C:A03


CH01COH 


63 


6701 


RTA00002922F.g.18.1.P.Seq


F


M00039055C:A01 


CH09LNL 


64 


12751 


RTA00002904F.c.10.1.P.Seq


F 


M00007202B:F01


CH02COH 


65 


3583 


RTA00002916F.n.21.1.P.Seq


F 


M00032644C:B05


CH08LNH 


66 


12673 


RTA00002901F.d.24.1.P.Seq


F 


M00005463A:G02


CH02COH 


67 


15243 


RTA00002901F.l.21.1.P.Seq


F 


M00005623B:G01 


CH02COH 


68 


21022 


RTA00002922F.k.24.1.P.Seq


F 


M00039IUA:C12 


CH09LNL 


69 


36596 


RTA00002919F.g.24.1.P.Seq


F 


M00033081D:D11 


CH08LNH


70 


4932 


RTA00002890F.c.14.1.P.Seq


F 


M00001596A:D02 


CH01COH


71


42413 


RTA00002900F.o.14.1.P.Seq


F 


M00005401D:F09 


CH02COH 


72 


1090 


RTA00002918F.g.20.1.P.Seq


F 


M00032892C:C12 


CH08LNH


73 


44737 


RTA00002901F.a.20.1.P.Seq


F 


M00005434A:C03 


CH02COH 


74 


4183


RTA00002918F.n.23.1.P.Seq


F 


M00032988B:G01


CH08LNH


75 


41882 


RTA00002902F.d.12.1.P.Seq


F 


M00006586D:D04


CH02COH 


76 


500 


RTA00002925F.o.18.1.P.Seq


F 


M00040034A:E06 


CH09LNL 


77 


5435 


RTA00002921F.L20. IP. Seq 


F 


M00033420B:E08


CH09LNL 


78 


15829 


RTA00002900F.J.O i . 1 P.Seq 


F 



M0000d314A;G10 
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CH01COH 
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p 


M00003822D:A02


CH01COH


543 


24633 


RTA00002907F.i.19.1.P.Seq


F 


M00022208B:D03


CH03MAH 


544 


72031 


RTA00002925F.k.03.1.P.Seq


F 


M00039929D:H10 


CH09LNL 


545 


5991 


RTA0000:9L6F.ia7.1.PSeq 


F 


M00032597A:H02 


CH08LNH


546 


14596 


RTA00002911F.n.15.1.P.Seq


F 


M00027131A:B03 


CH04MAL


547 


6923 


RTA00002896F.d.01.1.P.Seq


F 


M00004146B:E08


CH01COH


548 


6923 


RTA00002896F.c.24.1.P.Seq


F 


M00004146B:E08 


CH01COH


549 


21851


RTA00002887F.d.09.1.P.Seq


F 


M00001391D:D03 


CH01COH


550 


3935 


RTA00002925F.j.08.1.P.Seq


F 


M00039921C:H11


CH09LNL 
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seq name 
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CLONE ID 


LIBRARY 


551 


13328 


RTA00002909F.h.08.1.P.Seq


F 


M00022634B:H09


CH03MAH 


552 


2492 


RTA00002887F.e.11.1.P.Seq


F 


M00001393D:E02


CH01COH 


553 


11960 


RTA00002917F.b.03.1.P.Seq


F 


M00032671B:D08 


CH08LNH 


554 


186084


RTA00002912F.f.18.1.P.Seq


F 


M00027319D:F07 


CH04MAL 


555 


13644 


RTA00002925F.a.09.1.P.Seq


F 


M00039805B:B06 


CH09LNL 


556 


5707 


RTA00002909F.k.13.1.P.Seq


F 


M00022672C:H04


CH03MAH 


557 


95700 


RTA00002911F.p.14.1.P.Seq


F 


M00027182B:G06 


CH04MAL 


558 


342 


RTA00002922F.i.23.1.P.Seq


F 


M00039076D:G04 


CH09LNL 


559 


848 i 


RTA00002887F.c.12.1.P.Seq


F 


M00001389D:D06 


CH01COH


560 


12575 


RTA00002916F.i.12.1.P.Seq


F 


M00032594C:F05


CH08LNH 


561 


40712 


RTA0000292 lF.ci.08.LP.Seq 


F 


M00033359C:H05


CH09LNL 


562 


10768 


RTA00002886F.d.24.1.P.Seq


F 


M00001346B:G11


CH01COH


563 


38781 


RTA00002889F.k.23.1.P.Seq


F 


M00001559A:H09 


CH01COH


564 


8790 


RTA00002888F.e.08.1.P.Seq


F 


M00001461D:C10


CH01COH


565 


10167 


RTA00002916F.k.22.1.P.Seq


F 


M00032621A:F11 


CH08LNH 


566 


13706


RTA00002905F.e.21.1.P.Seq


F 


M00008019B:A01


CH03MAH 


567 


124172 


RTA00002900F.a.09.1.P.Seq


F 


M00004824A:D12


CH02COH 


568


92126 


RTA00002910F.g.12.1.P.Seq


F 


M00022904C:D04 


CH03MAH 


569 


5830 


RTA00002916F.j.09.1.P.Seq


F 


M00032605B:D09 


CH08LNH 


570 


15154 


RTA00002886F.p.13.1.P.Seq


F 


M00001382D:A07 


CH01COH


571 


25813


RTA00002910F.i.12.1.P.Seq


F 


M00022952A:B02 


CH03MAH 


572 


17268 


RTA00002886F.d.07.1.P.Seq


F 


M00001344D:E08 


CH01COH


573 


13684 


RTA00002915F.i.09.1.P.Seq


F 


M00031485B:G05 


CH08LNH 


574 


13460 


RTA00002898F.t\ 19. l.P.Seq 


F 


M00004341OA09 


CH01COH


575 


25115 


RTA00002919F.p.13.1.P.Seq


F 


M000333UB:G10 


CH08LNH 


576 


19949 


RTA00002905F.e.17.1.P.Seq


F 


M00008016B:E09


CH03MAH 


577 


24266


RTA00002917F.k.06.1.P.Seq


F 


M00032759A:A03 


CH08LNH 


578 


8243 


RTA00002901F.o.17.1.P.Seq


F 


M00005703B:E03


CH02COH 


579 


12576 


RTA00002900F.k.23.1.P.Seq


F 


M00005359B:B08


CH02COH 


580 


28531 


RTA00002909F.c.04.1.P.Seq


F 


M00022559D:G10 


CH03MAH 


581


15153 


RTA00002894F.o.21.1.P.Seq


F 


M00004054A:D03


CH01COH


582 


9493 


RTA00002894F.e.04.1.P.Seq


F 


M00003985D:B02 


CH01COH


583 


48140 


RTA00002914F.h.13.1.P.Seq


F 


M00028211A:F10


CH08LNH


584 


7626 


RTA00002895F.b.04.1.P.Seq


F 


M00004061B:E05 


CH01COH


585 


22668 


RTA00002896F.p.17.1.P.Seq


F 


M00004204C:H08 


CH01COH


586 


45691 


RTA00002903F.a. 11.1 P.Seq 


F 


M00022305A:B04 


CH03MAH 


587 


30429 


RTA00002904F.a.19.1.P.Seq


F 


M00007155D:C09 


CH02COH 


588 


46969 


RTA00002909F.2.02. LP.Seq 


F 


M00022618C:E04


CH03MAH 


589 


44030 


RTA00002900F.o.23.1.P.Seq


F 


M00005405C:D01


CH02COH 


590 


142548 


RTA00002905F.h.10.1.P.Seq


F 


M00008073A:D01 


CH03MAH 


591 


18455 


RTA00002905F.g.18.1.P.Seq


F 


M00008059B:F08


CH03MAH 


592 


7501 


RTA00002S94F.2.05. 1 .P.Seq 


F 


M00003993D:B03 


CH01COH


593 


7280 


RTA00002893F.n.22.1.P.Seq


F 


M00003959D:A05 


CH01COH


594 


19339 


RTA0O0O2S98F.1. 12. LP.Seq 


F 


M00004376D:A12


CH01COH


595 


30194 


RTA00002922F.k.05.1.P.Seq


F 


M00039100A:G04 


CH09LNL 


596 


32650 


RTA00002911F.i.05.1.P.Seq


F 


M00026994D:D07


CH04MAL 


597 


10510 


RTA00002905F.d.17.1.P.Seq


F 


M00008001B:F05


CH03MAH 


598 


13539 


RTA00002S98F.1'. 03. LP.Seq 


F 


M00004336A:A01


CH01COH


599 


20149 


RTA00002917F.o.03.1.P.Seq


F 


M00032791D:F01 


CH08LNH


600 


12780 


RTA00002S9lF.e.O() s LP.Seq 


F 


M00001686D:F06


CH01COH
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601 


182479 


RTA00002910F.j.18.1.P.Seq


F 


M00022972C:E05 


CH03MAH 


602 


14016 


RTA00002923F.n.17.1.P.Seq


F 


M00039344C:A11


CH09LNL


603 


76075 


RTA00002890F.h.23.1.P.Seq


F 


M00001620B:A03


CH01COH 


604 


9806 


RTA00002922F.j.05.1.P.Seq


F 


M00039078D:C10 


CH09LNL


605 


9036 


RTA00002889F.k.15.1.P.Seq


F 


M00001553A:E06 


CH01COH


606 


2619 


RTA00002907F.o.12.1.P.Seq


F 


M00022269C:A04 


CH03MAH 


607 


17517 


RTA00002907F.h.06.1.P.Seq


F 


M00022185A:B03


CH03MAH 


608 


5089 


RTA00002915F.e.22.2.P.Seq


F 


M00023777B:G04 


CH08LNH 


609 


6728 


RTA00002904F.b.13.1.P.Seq


F 


M00007178A:C02


CH02COH 


610 


41149 


RTA00002899F.g.20.1.P.Seq


F 


M00004603B:E02


CH01COH


611 


35017 


RTA00002892F.f.03.2.P.Seq


F 


M00003812C:A03


CH01COH


612 


7008 


RTA00002923F.a.08.1.P.Seq


F 


M00039165D:C04 


CH09LNL 


613 


185545 


RTA00002912F.k.16.1.P.Seq


F 


M00027480OE09 


CH04MAL 


614 


17840 


RTA00002892F.p.15.2.P.Seq


F 


M00003854B:F07 


CH01COH


615 


185914 


RTA00002912F.j.24.1.P.Seq


F 


M00027467A:C07 


CH04MAL 


616 


6862 


RTA00002903F.b.08.1.P.Seq


F 


M00006872D:B07


CH02COH 


617 


20120 


RTA00002888F.c.24.1.P.Seq


F 


M00001445B:F06


CH01COH


618 


20120


RTA00002888F.d.01.1.P.Seq


F 


M00001445B:F06 


CH01COH


619 


13879 


RTA00002923F.d.02.1.P.Seq


F 


M00039207A:F07 


CH09LNL 


620 


9330 


RTA00002915F.g.16.1.P.Seq


F 


M00028786B:A04


CH08LNH


621 


21572 


RTA00002921F.h.19.1.P.Seq


F 


M00033441A:B12 


CH09LNL 


622 


2943 


RTA00002919F.h.22.1.P.Seq


F 


M00033144A:D02 


CH08LNH


623 


32154 


RTA00002905F.b.16.1.P.Seq


F 


M00007969D:C01


CH03MAH 


624 


20875 


RTA00002901F.k.16.1.P.Seq


F 


M00005603B:H03 


CH02COH 


625 


186324 


RTA00002912F.d.17.1.P.Seq


F 


M00027274A:A09 


CH04MAL 


626 


10768 


RTA00002886F.e.01.1.P.Seq


F 


M00001346B:G11 


CH01COH


627 


16711 


RTA00002935F.m.11.1.P.Seq


F 


M00055221C:H11 


CH17COHLV


628 


14688 


RTA00002925F.n.14.1.P.Seq


F 


M00040023B:B10 


CH09LNL 


629 


44419 


RTA00002907F.b.19.1.P.Seq


F 


M00022 H3A:E06 


CH03MAH 


630 


12614 


RTA00002896F.p.03.1.P.Seq


F 


M00004201D:C01


CH01COH


631 


21658 


RTA00002902F.c.23.1.P.Seq


F 


M00006576D:C02


CH02COH


632 


10150 


RTA00002901F.L 16. LP.Seq 


F 


M00005540A:F09 


CH02COH 


633 


185909 


RTA00002912F.c.20.1.P.Seq


F 


M00027262A:A07


CH04MAL 


634 


14893 


RTA00002S90F.LOS. LP.Seq 


F 


M00001607D:H09 


CH01COH


635 


32125 


RTA00002903F.c.08.1.P.Seq


F 


M00006884D:A08


CH02COH 


636 


11909 


RTA00002902F.a.11.1.P.Seq


F 


M00005766D:D12


CH02COH 


637 


17237 


RTA00002901F.HX LP.Seq 


F 


M00005616B:F07 


CH02COH 


638 


11148 


RTA00002900F.j.18.1.P.Seq


F 


M00005346D:A03 


CH02COH 


639 


14837 


RTA00OO2925F.ri.20, LP.Seq 


F 


M00040025A:B04 


CH09LNL 


640 


4343 


RTA00002S97F.1. 1 3. 1 P.Seq 


F 


M00004282B:D07


CH01COH


641 


18686


RTA00002898F.j.16.1.P.Seq


F


M00004366D:C11


CH01COH


642 


10090 


RTA00002892F.n.10.2.P.Seq


F 


M00003842D:H09


CH01COH


643 


612 


RTA00002889F.d.13.2.P.Seq


F 


M00001535B:E02 


CH01COH


644 


10752 


RTA00002892F.n.06.2.P.Seq


F 


M00003842D:D11


CH01COH


645 


167203 


RTA00002914F.c.14.1.P.Seq


F 


M00028070A:H09


CH08LNH


646 


21269 


RTA00002901F.j.15.1.P.Seq


F 


M00005570A:B08


CH02COH 


647 


186250 


RTA00002910F.a.21.1.P.Seq


F 


M00022797D:A06


CH03MAH 


648 


24633 


RTA00002907F.i.19.2.P.Seq


F 


M00022208B:D03


CH03MAH 


649 


12295 


RTA00002918F.c.02.1.P.Seq


F 


M00032836B:A07


CH08LNH


650 


7870 


RTA00002905F.b.22.1.P.Seq


F 


M00007973B:D11


CH03MAH 
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651 


12225 


RTA00002902F.d.08.1.P.Seq


F 


M00006585A:D07 


CH02COH 


652 


7775 


RTA00002892F.o.12.2.P.Seq


F 


M00003847A:H04 


CH01COH 


653 


14901 


RTA00002929F.L2L LP.Seq 


F 


M00040349D:D07 


CH14EDT 


654 


6831 


RTA00002927F.b.21.1.P.Seq


F 


M00039483A:D10


CH12EDT 


655 


10738 


RTA00002930F.b.08.1.P.Seq


F 


M00042724A:G06 


CH15CON 


656 


17986 


RTA00002932F.a.20.1.P.Seq


F 


M00042972C:F04


CH18CON 


657 


23163


RTA00002895F.h.03.1.P.Seq


F 


M00004085A:H01 


CH01COH


658 


1 4838 


RTA00002923F.i.15.1.P.Seq


F 


M00039284D:H07


CH09LNL


659 


25386 


RTA00002905F.e.05.1.P.Seq


F 


M00008007B:E03 


CH03MAH 


660 


13217 


RTA00002887F.n.01.1.P.Seq


F 


M00001422B:D06


CH01COH


661 


30656 


RTA00002906FX03. LP.Seq 


F 


M00022032A:G05


CH03MAH 


662 


7852 


RTA00002889F.e.14.1.P.Seq


F 


M00001538B:A07


CH01COH


663 


13217 


RTA00002887F.m.24.1.P.Seq


F 


M00001422B:D06 


CH01COH


664 


15152 


RTA00002925F.f.24.1.P.Seq


F 


M00039873B:H04 


CH09LNL 


665 


24143 


RTA00002922F.o.18.1.P.Seq


F 


M00039143D:C10


CH09LNL 


666 


23872 


RTA00002892F.i.13.1.P.Seq


F 


M00003823B:A06


CH01COH


667 


13940 


RTA00002906F.g.23.1.P.Seq


F 


M00021967D:H06 


CH03MAH 


668 


25759 


RTA00002907F.m.10.1.P.Seq


F 


M00022249D:C01


CH03MAH 


669 


5761 


RTA00002924F.p.05.1.P.Seq


F 


M00039786D:A10


CH09LNL 


670 


41703 


RTA000029O IF.2.23. LP.Seq 


F 


M00005506D:E11 


CH02COH 


671 


7165 


RTA00002909F.i.06.1.P.Seq


F 


M00022648A:D08 


CH03MAH 


672 


41492 


RTA00002889F.m.18.1.P.Seq


F 


M00001565A:H05


CH01COH


673 


9331 


RTA00002906F.g.10.1.P.Seq


F 


M00021953B:E08


CH03MAH 


674 


7961 


RTA00002887F.g.24.1.P.Seq


F 


M00001399B:B01 


CH01COH


675 


15367 


RTA00002893F.n.17.1.P.Seq


F 


M00003958C:H08


CH01COH


676 


185628 


RTA00002912F.f.17.1.P.Seq


F 


M00027319C:C03


CH04MAL 


677 


7386 


RTA00002891F.L 14. LP.Seq 


F 


M00003768D:D08


CH01COH


678 


67391 


RTA00002893F.p.07.1.P.Seq


F 


M00003968C:G03


CH01COH


679 


46380 


RTA00002906F.f.10.1.P.Seq


F 


M00021933B:F02


CH03MAH 


680 


14265 


RTA00O02S92F.eX)5.2.P.Seq 


F 


M00003808A:F11


CH01COH


681 


186478 


RTA00002912F.f.07.1.P.Seq


F 


M00027313C:E01 


CH04MAL 


682 


8192 


RTA00002916F.m.07.1.P.Seq


F 


M00032634B:D09 


CH08LNH 


683 


13776 


RTA0OOO2925F.I. 10, LP.Seq 


F 


M00039976C:F11 


CH09LNL 


684 


11796 


RTA00002912F.e.02.1.P.Seq


F 


M0002729LA;G08 


CH04MAL 


685 


10827 


RTA00002919F.i.10.1.P.Seq


F 


M00033147OB0S 


CH08LNH 


686 


1482 


RTA00002925F.1. 12. LP.Seq 


F 


M00039977B:D12 


CH09LNL 


687 


30300 


RTA00002906F.f.16.1.P.Seq


F 


M00021941A:D09


CH03MAH 


688 


10454 


RTA00002890F.i.15.1.P.Seq


F 


M00001623D:A10


CH01COH


689 


16649 


RTA00002907F.I.OL I P.Seq 


F 


M00022229D:E01


CH03MAH 


690 


7026 


RTA00002887F.b.10.1.P.Seq


F 


M000013S^B;AH 


CH01COH


691 


5691 


RTA00002895F.n.13.1.P.Seq


F 


M00004114C:D11


CH01COH


692 


13797 


RTA00002918F.i.21.1.P.Seq


F 


M00032918D:B04


CH08LNH 


693 


5187 


RTA00002923F.n.03.1.P.Seq


F 


M00039335B:F07 


CH09LNL 


694 


186115 


RTA00002912RLOL LRSeq 


F 


M00027376C:A02


CH04MAL 


695 


4826 


RTA000029 17F.2.24. LP.Seq 


F 


M00032729A:F10 


CH08LNH


696 


6733 


RTA00002917F.m.11.1.P.Seq


F 


M00032774C:C04 


CH08LNH


697 


7604 


RTA00002923F.j.05.1.P.Seq


F 


M00039291D:F02


CH09LNL 


698 


46459 


RT A00002905F. LO 1 . 1 R.Scq 


F 


M00008020D:F02


CH03MAH 


699 


23385 


RTA00002SS9F.1.23. LP.Seq 


F 


M00001551D:D01


CH01COH


700 


7516 


RTA00002891F.h.11.1.P.Seq


F 


M00003749C:C08


CH01COH



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ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID


LIBRARY 


701 


45048 


RTA00002906F.b.04.1.P.Seq


F 


M00021855D:F10 


CH03MAH 


702 


14845 


RTA00002903F.o.02.1.P.Seq


F 


M00007103OC12 


CH02COH


703 


16479 


RTA00002887F.i.15.1.P.Seq


F 


M00001403D:C12 


CH01COH


704 


186729 


RTA00002911F.d.19.2.P.Seq


F 


M00026850B:C09


CH04MAL


705 


33658 


RTA00002886F.j.07.1.P.Seq


F 


M00001361B:A12 


CH01COH


706 


186755 


RTA000029l2F.Ll8.1P,Seq 


F 


M00027400D:H02 


CH04MAL 


707 


4262 


RTA00002897F.a.04.1.P.Seq


F 


M00004208A:D08


CH01COH


708 


14039 


RTA00002897F.k.01.1.P.Seq


F 


M00004276C:A08


CH01COH


709 


11948 


RTA00002895F.n.24.1.P.Seq


F 


M00004118C:D12 


CH01COH


710 


14865 


RTA000O2908FX 14. 1 P.Seq 


F 


M00022481B:A04 


CH03MAH 


711


10779 


RTA00002915F.i.01.1.P.Seq


F 


M00031370B:C01


CH08LNH 


712 


7503 


RTA00002902F.k.16.1.P.Seq


F 


M00006738A:F12 


CH02COH 


713 


48130 


RTA00002902F.e.04.1.P.Seq


F 


M00006595B:C10


CH02COH 


714 


7858 


RTA00002907F.m.12.1.P.Seq


F 


M00022250A:B04 


CH03MAH 


715 


4682 


RTA00002924F.n.02.1.P.Seq


F 


M00039710B:E01 


CH09LNL 


716 


20650 


RTA00002888F.p.10.1.P.Seq


F 


M00001503B:H10 


CH01COH


717 


25320 


RTA00002910F.e.19.1.P.Seq


F 


M00022857B:A09 


CH03MAH 


718 


4924 


RTA00002930F.g.01.2.P.Seq


F 


M00055805A:H02 


CH15CON 


719 


21170 


RTA00002900F.1. 13.1 .P.Seq 


F 


M00005365A:F05


CH02COH 


720 


9258 


RTA00002890F.h.17.1.P.Seq


F 


M00001618C:D01


CH01COH


721 


14039 


RTA00002897F.j.24.1.P.Seq


F 


M00004276C:A08


CH01COH


722 


3483 


RTA00002899F.b.07.1.P.Seq


F 


M00004430A:A05 


CH01COH


723 


3877 


RTA00002S97F.IUO. l.P.Seq 


F 


M00004278A:G06 


CH01COH


724 


7483 


RTA00002923F.F. 19, l.P.Seq 


F 


M00039246B:A08 


CH09LNL 


725 


99750 


RTA00002900F.1. 17. 1 P.Seq 


F 


M00005366D:F08


CH02COH 


726 


46459 


RTA00002905F.e.24.1.P.Seq


F 


M00008020D:F02


CH03MAH 


727 


3591 


RTA00002886F.o.09.1.P.Seq


F 


M00001373C:E10


CH01COH


728 


11277 


RTA00002923F.*. 1 1. l.P.Seq 


F 


M00039251D:308 


CH09LNL 


729 


10292 


RTA00002898F.n.11.1.P.Seq


F 


M00004395C:D06


CH01COH


730 


23211 


RTA00002922F.k.13.1.P.Seq


F 


M00039105D:A08


CH09LNL 


731 


185698 


RTA00002911F.d.03.2.P.Seq


F 


M00026836B:H03


CH04MAL 


732 


24702 


RTA00002898F.i.17.1.P.Seq


F 


M00004360QD09 


CH01COH


733 


12595 


RTA00002904F.c.06.1.P.Seq


F 


M00007197B:B05 


CH02COH 


734 


177444 


RTA00002910F.o.06.1.P.Seq


F 


M00023097D:B08 


CH03MAH 


735 


38147 


RTA00002886F.o.11.1.P.Seq


F 


M00001379AP09 


CH01COH


736 


17909 


RTA00002908F.d.09.1.P.Seq


F 


M00022386D:F10


CH03MAH 


737 


13399 


RTA00002900F.n.07.1.P.Seq


F 


M00005385A:B12


CH02COH


738 


17720 


RT A00O02905F.2.22. 1 P.Seq 


F 


M00008065D:A07


CH03MAH 


739 


45974 


RTA00002909F.I.08- 1- P.Seq 


F 


M00022681D:E10 


CH03MAH 


740 


10779 


RTA00002915F.h.24.2.P.Seq


F 


M00031370B:C01 


CH08LNH 


741 


195 


RTA00002914F.a.14.1.P.Seq


F 


M00028055B:G07 


CH08LNH


742 


1712 


RTA00002915F.j.07.1.P.Seq


F 


M00031484A:D03


CH08LNH 


743 


185726 


RTA00002912F.a.21.1.P.Seq


F 


M00027215B:B12 


CH04MAL 


744 


150298 


RTA00002907F.d.20.1.P.Seq


F 


M00022140D:A07 


CH03MAH 


745 


358 


RTA00002898F.i.02.1.P.Seq


F 


M00004358B:G02


CH01COH


746 


42920 


RTA00002900F.L 16. 1 P.Seq 


F 


M00005309B:A11


CH02COH 


747 


25681 


RTA00002894F.k.18.1.P.Seq


F 


M00004038A:A04


CH01COH


748 


18005 


RTA00002903F.c.18.1.P.Seq


F 


M00006890C:F10


CH02COH 


749 


16143 


RTA00002892F.m.04.2.P.Seq


F 


M00003839C:H10


CH01COH


750 


9306 


RTA00002902F.d.09.1.P.Seq


F 


M00006585A:F09


CH02COH 



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CLONTE ID 


LIBRARY 


751 


32293 


RTA00002901F.i.13.1.P.Seq


F 


M00005535B:B01


CH02COH 


752 


8913 


RTA00002901F.j.07.1.P.Seq


F 


M00005557D:H10 


CH02COH 


753 


185819 


RTA00002912F.a.20.1.P.Seq


F 


M00027215A:F06


CH04MAL 


754 


10559 


RTA00002898F.o.12.1.P.Seq


F 


M00004406A:G09 


CH01COH 


755 


8740 


RTA00002923F.o.11.1.P.Seq


F 


M00039383A:H07


CH09LNL 


756 


160257 


RTA0O0O2907R!.l2.2.P.Scq 


F 


M00022237QE04 


CH03MAH 


757 


6078 


RTA0OOO2930F.cll-LP.Seq 


F 


M00055433D:G03


CH15CON 


758 


12543


RTA00002927F.b.14.1.P.Seq


F 


M00039377B:E05


CH12EDT 


759 


9686 


RTA00002930F.j.19.1.P.Seq


F 


M00055794A:E10


CH15CON


760 


3369 


RTA00002930F.b.12.1.P.Seq


F 


M00042732B:H06


CH15CON


761 


6891 


RTA00002895F.i.03.1.P.Seq


F 


M000040STC;E02 


CH01COH 


762 


13666 


RTA00002892F.i.05.1.P.Seq


F 


M00003822C:A09


CH01COH


763 


6925 


RTA00002930F.k.24.1.P.Seq


F


M0005645SOE01 


CH15CON


764 


11351 


RTA00002901F.g.15.1.P.Seq


F 


M00005504D:F06 


CH02COH 


765 


11497 


RTA00002889F.a.21.1.P.Seq


F 


M00001512D:F08 


CH01COH


766 


1596 


RTA00002922F.m.18.1.P.Seq


F 


M00039125D:H12 


CH09LNL 


767 


186519


RTA00002924F.a.22.1.P.Seq


F 


M00039411D:D09 


CH09LNL 


768 


24429 


RTA00002903F.j.04.1.P.Seq


F 


M00006989B:G05


CH02COH 


769 


33795 


RTA00002902F.k.18.1.P.Seq


F 


M00006739B:A04 


CH02COH 


770 


24267 


RTA00002889F.1. 17. 1 .P.Seq 


F 


M00001561D:H04


CH01COH


771 


12536 


RTA00002891F.j.20.1.P.Seq


F 


M00003760C:G10 


CH01COH


772 


22627 


RTA00002887F.k.07.1.P.Seq


F 


M00001410A:G10 


CH01COH


773 


24430 


RTA00002901F.h.20.1.P.Seq


F 


M00005520B:E01


CH02COH


774 


16151 


RTA0OO02897F.I.22. 1 P.Seq 


F 


M00004284A:F08


CH01COH


775 


6148 


RTA00002S90RU 6. LP.Seq 


F 


M0000162JDr£l2 


CH01COH


776 


106064 


RTA00002908RU 9. LP.Seq 


F 


M00022485B:E07


CH03MAH 


777 


9573 


RTA00002893F.p.13.1.P.Seq


F 


M00003970D:H07


CH01COH


778 


19542 


RTA00002902F.j.20.1.P.Seq


F 


M00006756B:G06 


CH02COH 


779 


16672 


RTA00002889F.b.21.1.P.Seq


F 


M00001528C:C03


CH01COH


780 


8573 


RTA00002891F.p.07.1.P.Seq


F 


M00003785D:F07


CH01COH


781 


15746 


RTA00002896F.h.10.1.P.Seq


F 


M00004165C:A03 


CH01COH


782 


4500 


RTA00002887F.b.08.1.P.Seq


F 


M000013S"A:C12 


CH01COH


783 


16003 


RTA00002910F.c.08.1.P.Seq


F 


M00022820A:F07 


CH03MAH 


784 


18723 


RTA00002916F.g.18.1.P.Seq


F 


M00032580D:A09


CH08LNH 


785 


4270 


RTA00002922F.b.01.1.P.Seq


F 


M00038616C:C09


CH09LNL 


786 


30095 


RTA00002907F.i.20.1.P.Seq


F 


M00022208C:E04


CH03MAH


787 


42916 


RTA00002924F.c.08.1.P.Seq


F 


M00039433B:D06


CH09LNL 


788 


13652 


RTA00002902F.j.09.1.P.Seq


F 


M000067t4C:D06 


CH02COH 


789 


6972 


RTA00002902F.|06. LP.Seq 


F 


M000067i:C:H01 


CH02COH 


790 


4519 


RTA00002910F.i.06.1.P.Seq


F 


M0002294~B:D02 


CH03MAH 


791 


13106 


RTA00002928F.f.09.1.P.Seq


F 


. M0004022^C:F06 


CH13EDT 


792 


98186 


RTA00002909F.m.08.1.P.Seq


F 


M00022696B:C11


CH03MAH 


793 


3167 


RTA0O0O:S9SRg.09. LP.Seq 


F 


M0000434^D:C12 


CH01COH


794 


3272 


RTA0OOO2S97Ra. IS. LP.Seq 


F 


MO00O42i:D:CO3 


CH01COH


795 


14446 


RTA00002899F.d.05.1.P.Seq


F 


M0000446:D;D12 


CH01COH


796 


17865 


RTA00OO:9l8F.a.l3.LP.Scq 


F 


M0003:825B:F08 


CH08LNH 


797 


5834 


RTAO0OO:S98Rh. 12. LP.Seq 


F 


M0000435:a:DOS 


CH01COH


798 


14533 


RTA00002896F.k.24.1.P.Seq


F 


M00004179C:B06


CH01COH


799 


15222 


RTA00002900F.j.05.1.P.Seq


F 


M0000533ZA.C06 


CH02COH 


800 


22594 


RTAO0O0:S9SRh.2L LP.Seq 


F 


M0000435~3:B06 


CH01COH
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LIBRARY 


801 


9204 


RTA0O0Q2S90F.H.20. l.P.Seq 


F 


M00001619C:H09


CH01COH


802 


186464 


RTA00002911F.d.09.2.P.Seq


F 


M00026842D:C02 


CH04MAL


803 


5441 


RTA00002900F.a.11.1.P.Seq


F 


M00004824D:H05


CH02COH 


804 


32544 


RTA00002393F.1.2 1, l.P.Seq 


F 


M00003935B:B01


CH01COH


805 


15351 


RTA00002915F.j.15.1.P.Seq


F 


M00032471D:A05 


CH08LNH 


806 


13129 


RTA00002898F.a.12.1.P.Seq


F 


M00004310B:E02


CH01COH


807 


186376 


RTA00002912F.k.21.1.P.Seq


F 


M00027485C:F07 


CH04MAL 


808 


17816 


RTA00002901F.o.04.1.P.Seq


F 


M00005674C:F04


CH02COH 


809 


8434 


RTA00O02923F.I.22. 1 .P.Seq 


F 


M00039326C:B08 


CH09LNL 


810 


22146 


RTA00002922F.L08. 1 .P.Seq 


F 


M00039067B:F07 


CH09LNL 


811 


31912


RTA00002904F.a.14.1.P.Seq


F 


M00007154A:E06 


CH02COH 


812 


1487 


RTA00002925F.n.03.1.P.Seq


F 


M00040016C:E07


CH09LNL 


813 


24777


RTA00002900F.n.02.1.P.Seq


F 


M00005380B:H10 


CH02COH 


814 


144483 


RTA00002902F.d.01.1.P.Seq


F 


M00006577A:H10 


CH02COH 


815 


6546 


RTA00002935F.p.16.1.P.Seq


F 


M00055425C:A04


CH17COHLV 


816 


5984 


RTA00002935F.p.09.1.P.Seq


F 


M00055420A:E06 


CH17COHLV 


817 


24441 


RTA00002900F.a.22.1.P.Seq


F 


M00004832D:G04 


CH02COH 


818 


20889 


RTA00002935F.h.09.1.P.Seq


F 


M00054807D:C11


CH17COHLV 


819 


127721 


RTA00002915F.c.18.1.P.Seq


F 


M00028763A:G11


CH08LNH 


820 


20684 


RTA00002900F.c.03.1.P.Seq


F 


M00004843A:G12 


CH02COH 


821 


30095 


RTA00002907F.i.20.2.P.Seq


F 


M00022208C:E04


CH03MAH 


822 


6763 


RTA00002892F.o.01.2.P.Seq 


F 


M00003845D:G03


CH01COH


823 


6763 


RTA00002892F.n.24.2.P.Seq


F 


M00003845D:G03


CH01COH


824 


48725 


RTA00002907E!.22-2.P,Seq 


F 


M00022240B:C12


CH03MAH 


825 


21260 


RTA00002935F.c.22.1.P.Seq


F 


M00054499A:C08


CH17COHLV


826 


42572 


RTA00002930F.c.21.1.P.Seq


F 


M00055454A:D02 


CH15CON 


827 


3441 


RTA00002935F.i.13.1.P.Seq


F 


M00054890C:D05


CH17COHLV 


828 


21419 


RTA00002930F.b.13.1.P.Seq


F 


M00042734A:F05 


CH15CON


829 


8004 


RTA00002910F.b.08.1.P.Seq


F 


M00022S053:A10 


CH03MAH 


830 


185870 


RTA00002912F.c.06.1.P.Seq


F 


M00027247C:D02 


CH04MAL 


831 


24580 


RTA00002930F.d.01.1.P.Seq


F 


M00055466A:F06 


CH15CON


832 


5153 


RTA00002930F.b.16.1.P.Seq


F 


M00042743D:G10


CH15CON


833 


8653 


RTA00002895F.f.17.1.P.Seq


F 


M00004080C:C04


CH01COH


834 


23799 


RTA00002924F.I.23. 1 .P.Seq 


F 


M00039698C:B03


CH09LNL 


835 


11012 


RTA00002930F.J.09 J .P.Seq 


F 


M00056215D:F02


CH15CON


836 


46592 


RTA00002900F.b.19.1.P.Seq


F 


M00004839B:C12 


CH02COH 


837 


6650 


RTA00002908F.m.12.1.P.Seq


F 


M00022491D:A10 


CH03MAH 


838 


16618 


RTA00002889F.n.18.1.P.Seq


F 


M00001568C:A03


CH01COH


839 


18274 


RTA00002889F.g.05.1.P.Seq


F 


M00001543C:A08


CH01COH


840 


20694 


RTA00002908F.h.08.1.P.Seq


F 


M00022442B:G03 


CH03MAH 


841 


9493 


RTA00002909F.m.11.1.P.Seq


F 


M00022698C:D10


CH03MAH 


842 


6132 


RTA00002897F.c.04.1.P.Seq


F 


M00004220D:CH 


CH01COH


843 


186259 


RTA00002912F.m.13.1.P.Seq


F 


M000275:7B:C05 


CH04MAL 


844 


3769 


RTA00002916F.g.22.1.P.Seq


F 


M00032581B:A09


CH08LNH


845 


36584 


RTA00002935F.f.12.1.P.Seq


F 


M000546S ? 3D:Gll 


CH17COHLV


846 


38077 


RTA00002890F.e.06.1.P.Seq


F 


MOOOO 16053:305 


CH01COH


847 


3927 


RTA00002935F.a.12.1.P.Seq


F 


M00042516B:D01


CH17COHLV


848 


4275 


RTA00002914F.b.16.1.P.Seq


F 


MOO02S0t>3C:HO! 


CH08LNH


849 


12554 


RTA00002921F.a.23.1.P.Seq


F 


M00033302A:E11


CH09LNL 


850 


13761 


RTA00002901F.f.22.1.P.Seq


F 


M00005489B:C08


CH02COH 



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CLUSTER 


SEQ NAME 


ORIENTATION


CLONE ID 


LIBRARY 


851 


19059 


RTA00002897F.e.22.1.P.Seq


F 


M00004237C:D10


CH01COH


852 


22944 


RTA00002935F.b.17.1.P.Seq


F 


M00043355A:D07


CH17COHLV


853 


2189


RTA00002925F.j.06.1.P.Seq


F 


M00039921A:B10


CH09LNL 


854 


19153


RTA00002892F.h.04.2.P.Seq


F 


M00003819B:B01


CH01COH


855 


1833 


RTA00002890F.e.13.1.P.Seq


F 


M00001606B:A10


CH01COH


856 


18447 


RTA00002935F.d.23.1.P.Seq


F 


M00054569A:B07


CH17COHLV 


857 


2461 


RTA00002922F.b.08.1.P.Seq


F 


M00038619B:F09 


CH09LNL 


858 


15917 


RTA00002896F.j.06.1.P.Seq


F 


M00004172C:A08


CH01COH


859 


9379 


RTA00002935F.a.15.1.P.Seq


F 


M00043299A:B10 


CH17COHLV


860 


5511 


RTA00002931F.b.06.1.P.Seq


F 


M00042796A:A10


CH16COP


861 


10540 


RTA00002891F.k.16.1.P.Seq


F 


M00003764B:H11 


CH01COH


862 


12117 


RTA00002899F.a.09.1.P.Seq


F 


M00004419A:G02 


CH01COH


863 


8777 


RTA00002919F.a.23.1.P.Seq


F 


M00033028D:C10 


CH08LNH 


864 


23972 


RTA00002900F.o.18.1.P.Seq


F 


M00005403C:A01


CH02COH 


865 


17005 
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CH01COH 


1024 


12473 


RTA00002888F.o.07.1.P.Seq


F 


M0000149:C;F10 


CH01COH 


1025 


15840 


RTA00002919F.n.17.1.P.Seq


F 


M00033230OG10 


CH08LNH


1026 


6554 


RTA00002895F.b.16.1.P.Seq


F 


M00004062D:A02 


CH01COH 


1027 


7330 


RTA00002918F.n.17.1.P.Seq


F


M00032987B:F01


CH08LNH


1028 


2206 


RTA00002919F.f.12.1.P.Seq


F


M00033071C:G05


CH08LNH


1029 


42705 


RTA00002935F.f.02.1.P.Seq 


F 


M00054643D:F07 


CH17COHLV


1030 


33865 


RTA00002930F.b.07.1.P.Seq


F


M00042722C:C09


CH15CON


1031 


5196 


RTA00002925F.o.19.1.P.Seq


F


M00040054B:G02


CH09LNL


1032 


8087 


RTA00002935F.a.10.1.P.Seq


F 


M00055375C:F12 


CH17COHLV 


1033 


20072 


RTA00002935F.g.06.1.P.Seq


F


M00054744C:F12


CH17COHLV 


1034 


12797 


RTA00002935F.g.24.1.P.Seq


F


M00054781D:A11


CH17COHLV


1035 


3207 


RTA00002930F.c.04.1.P.Seq


F 


M00054793B:A06 


CH15CON 


1036 


19600 


RTA00002929F.f.24.1.P.Seq


F


M00040351D:G07


CH14EDT 


1037 


6278 


RTA00002935F.j.18.1.P.Seq


F


M00055001C:G10


CH17COHLV


1038 


19363 


RTA00002927F.e.12.1.P.Seq


F 


M00039564D:D04 


CH12EDT 


1039 


15447 


RTA00002929F.g.12.1.P.Seq


F


M00040366B:H10


CH14EDT 


1040 


9676 


RTA00002932F.a.14.1.P.Seq


F


M00042960B:C06


CH18CON


1041 


12560 


RTA00002929F.h.06.1.P.Seq


F


M00040381A:B06


CH14EDT 


1042 


12727 


RTA00002933F.lv 15. l.P.Seq 


F 


M00043219C:C02


CH19COP


1043 


27475 


RTA00002914F.c.16.1.P.Seq


F


M00028070D:C03


CH08LNH


1044 


30646 


RTA0000290SF.ML l.P.Seq 


F 


M000224l6D:D0t 


CH03MAH 


1045 


45585


RTA00002925F.h.20.1.P.Seq


F 


M00039S94OD09 


CH09LNL 


1046 


25025 


RTA00002925F.e.18.1.P.Seq


F


M00039860B:E01


CH09LNL 


1047 


15715 


RTA00002919F.p.05.1.P.Seq


F


M00033274D:F03


CH08LNH


1048 


38185


RTA00002926F.c.07.2.P.Seq


F


M00040078A:C07


CH09LNL 


1049 


8384


RTA00002903F.o.13.1.P.Seq


F


M00007112D:D03


CH02COH


1050 


8843


RTA00002917F.h.17.1.P.Seq


F


M00032733B:F12


CH08LNH
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CLONE CD 


LIBRARY 


1051


21401


RTA00002930F.e.20.2.P.Seq


F


M00055676A:G02


CH15CON


1052 


14434 


RTA00002903F.l.01.1.P.Seq


F 


M00007032A:B05 


CH02COH 


1053 


40045


RTA00002891F.k.14.1.P.Seq


F 


M00003764A:H09 


CH01COH 


1054 


21853 


RTA00002896F.a.12.1.P.Seq


F


M00004159C:D10


CH01COH 


1055 


23439 


RTA00002935F.p.01.1.P.Seq


F 


M00055402A:H01 


CH17COHLV 


1056 


13060 


RTA00002934F.a.19.1.P.Seq


F 


M00043529A:B08 


CH20COHLV 


1057 


23439 


RTA00002935F.o.24.1.P.Seq


F


M00055402A:H01


CH17COHLV 


1058 


20547 


RTA00002931F.b.12.1.P.Seq


F


M00042822A:H04


CH16COP 


1059 


4319 


RTA00002930F.a.03.1.P.Seq


F


M00042525B:H01


CH15CON 


1060 


21430 


RTA00002901F.c.18.1.P.Seq


F


M00005452B:G03


CH02COH 


1061 


7668 


RTA00002935F.g.23.1.P.Seq


F


M00054781B:H04


CH17COHLV 


1062 


16239 


RTA00002935F.k.19.1.P.Seq


F


M00055081A:A05


CH17COHLV 


1063 


5631 


RTA00002929F.e.16.1.P.Seq


F 


M00040326B:G09 


CH14EDT 


1064 


18362 


RTA00002928F.a.04.1.P.Seq


F


M00039739B:H12


CH13EDT


1065 


8034 


RTA00002932F.a.22.1.P.Seq


F


M00042982D:A10


CH18CON 


1066 


12497 


RTA00002928F.a.19.1.P.Seq


F 


M00040132A:H09 


CH13EDT 


1067 


21001 


RTA00002932F.b.07.1.P.Seq


F 


M00042996B:H08 


CH18CON 


1068


471


RTA00002927F.a.11.1.P.Seq


F


M00039184D:H09


CH12EDT 


1069 


10003 


RTA00002897F.b.13.1.P.Seq


F


M00004215B:C05


CH01COH 


1070 


16074 


RTA00002935F.f.18.1.P.Seq


F


M00054708C:B06


CH17COHLV 


1071


13698


RTA00002902F.l.01.1.P.Seq


F 


M00006743A:H11 


CH02COH 


1072 


24819 


RTA00002922F.j.03.1.P.Seq


F


M00039078B:B03


CH09LNL 


1073 


21511 


RTA00002892F.i.01.1.P.Seq


F


M00003821C:E12


CH01COH 


1074 


12402 


RTA00002929F.d.15.1.P.Seq


F 


M000403UB:D07 


CH14EDT 


1075 


142755 


RTA00002903F.p.06.1.P.Seq


F


M00007126A:A02


CH02COH 


1076 


3010 


RTA00002935F.p.12.1.P.Seq


F 


M00055423C:G12 


CH17COHLV 


1077 


17173 


RTA00002935F.k.09.1.P.Seq


F


M00055043B:H08


CH17COHLV 


1078 


2969 


RTA00002933F.a.12.1.P.Seq


F 


M00043076D:A02 


CH19COP 


1079 


19600 


RTA00002929F.g.01.1.P.Seq


F


M00040351D:G07


CH14EDT 


1080 


8542 


RTA00002927F.h.24.1.P.Seq


F 


M0003964*A:A02 


CH12EDT 


1081 


24795 


RTA00002927F.f.10.1.P.Seq


F 


M00039594C:B06 


CH12EDT 


1082 


19695 


RTA00002927F.i.03.1.P.Seq


F 


M0003964"B:A02 


CH12EDT 


1083 


8542 


RTA00002927F.i.01.1.P.Seq


F 


M0003964"A:A02 


CH12EDT 


1084 


21409 


RTA00002902F.c.09.1.P.Seq


F 


M00006601D:G05 


CH02COH 


1085 


186318 


RTA00002912F.h.07.1.P.Seq


F 


M00027363D:G04 


CH04MAL 


1086 


7379 


RTA00002901F.e.10.1.P.Seq


F


M00005468D:C01


CH02COH 


1087 


91285 


RTA00002909F.l.12.1.P.Seq


F


M00022682D:A10


CH03MAH 


1088 


3285 


RTA00002903F.m.18.1.P.Seq


F 


M000070C?4D:D12 


CH02COH 


1089 


6284 


RTA00002896F.g.22.1.P.Seq


F


M00004160D:G05


CH01COH


1090 


15676 


RTA00002935F.h.21.1.P.Seq


F


M00054843A:C01


CH17COHLV 


1091 


34112 


RTA00002894F.o.19.1.P.Seq


F


M00004053D:F09


CH01COH


1092 


16407 


RTA00002892F.o.24.2.P.Seq


F 


M00003S5iB:A01 


CH01COH


1093 


919 


RTA00002890F.f.18.1.P.Seq


F


M00001609D:C11


CH01COH


1094 


59069 


RTA00002896F.k.07.1.P.Seq


F


M00004175D:E06


CH01COH


1095 


31167 


RTA00002900F.f.05.1.P.Seq


F 


M00004S7oB:A06 


CH02COH 


1096 


23873 


RTA00002930F.i.05.1.P.Seq


F 


M00056035D:A08 


CH15CON 


1097 


15679 


RTA00002900F.h.12.1.P.Seq


F 


M000050lcC:E04 


CH02COH 


1098


31852


RTA00002911F.o.22.1.P.Seq


F


M00027170D:C07


CH04MAL


1099 


39030 


RTA00002891F.f.07.1.P.Seq


F


M00001692C:C04


CH01COH


1100 


16407 


RTA00002892F.p.01.2.P.Seq


F 


MO00O3S5;B:AOl 


CH01COH






SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID


LIBRARY 


1101


4118


RTA00002935F.a.17.1.P.Seq


F


M00043306D:C01


CH17COHLV


1102


11054


RTA00002915F.n.07.2.P.Seq


F 


M00032504B:B10 


CH08LNH 


1103 


186640 


RTA00002918F.d.03.1.P.Seq


F


M00032848D:B10


CH08LNH 


1104


9301 


RTA00002887F L22. LP.Seq 


F 


M00001396D:H02


CH01COH


1105 


13438 


RTA00002901F.a.11.1.P.Seq


F


M00005422D:H10


CH02COH 


1106


23691


RTA00002901F.g.21.1.P.Seq


F


M00005506C:E09


CH02COH 


1107 


32788 


RTA00002909F.m.04.1.P.Seq


F 


M00022694A:F05 


CH03MAH 


1108 


34364 


RTA00002915F.o.09.2.P.Seq


F 


M00032515A:B12 


CH08LNH 


1109 


24840 


RTA00002908F.i.01.1.P.Seq


F


M00022452B:E06


CH03MAH 


1110 


3416 


RTA00002911F.j.17.1.P.Seq


F


M00027036A:B06


CH04MAL 


1111


16889


RTA00002930F.f.04.1.P.Seq


F


M00055724B:E04


CH15CON


1112


2159


RTA00002929F.f.15.1.P.Seq


F 


M00040344C:D05 


CH14EDT 


1113 


8880 


RTA00002929F.f.02.1.P.Seq


F


M00040338A:B10


CH14EDT 


1114 


10722 


RTA00002921F.o.22.1.P.Seq


F 


M00038304B:E02 


CH09LNL 


1115 


15046 


RTA00002887F.c.08.1.P.Seq


F


M00001389B:E10


CH01COH


1116 


13868 


RTA00002898F.e.18.1.P.Seq


F


M00004347B:E04


CH01COH


1117 


4226 


RTA00002925F.f.15.1.P.Seq


F 


M00039869A:H01 


CH09LNL 


1118 


90435 


RTA00002909F.l.06.1.P.Seq


F 


M00022678B:C08 


CH03MAH 


1119


25686


RTA00002900F.l.02.1.P.Seq


F 


M00005359B:D09 


CH02COH 


1120 


7296 


RTA00002900F.d.08.1.P.Seq


F 


M00004S56O:F09 


CH02COH 


1121 


11546 


RTA00002905F.o.16.1.P.Seq


F


M00021678A:H03


CH03MAH 


1122 


15748 


RTA00002901F.k.12.1.P.Seq


F


M00005515D:F02


CH02COH 


1123 


5591 


RTA00002903F.p.20.1.P.Seq


F 


M00007U1C:B05 


CH02COH 


1124 


9433 


RTA00002935F.p.14.1.P.Seq


F


M00055424B:H06


CH17COHLV


1125 


9654 


RTA00002924F.i.09.1.P.Seq


F


M00039654C:C11


CH09LNL 


1126 


21914 


RTA00002889F.i.06.1.P.Seq


F


M00001550A:H06


CH01COH


1127


4277


RTA00002927F.h.13.1.P.Seq


F 


M00039642A:A08 


CH12EDT 


1128 


12362 


RTA00002929F.h.24.1.P.Seq


F 


M00040391A:G05 


CH14EDT 


1129 


449 


RTA00002924F.e.22.1.P.Seq


F


M00039471D:G10


CH09LNL 


1130 


1820 


RTA00002922F.n.11.1.P.Seq


F 


M00039133C:F12 


CH09LNL 


1131 


12159 


RTA00002930F.b.24.1.P.Seq


F


M00042894C:A11


CH15CON


1132 


25106 


RTA00002903F.d.21.1.P.Seq


F 


M00006907B:C06 


CH02COH 


1133 


2245 


RTA00002917F.b.20.1.P.Seq


F


M00032676C:C10


CH08LNH 


1134 


14388 


RTA00002894F,hm 1 .P.Seq 


F 


M00003998B:G10


CH01COH


1135 


12219 


RTA00002898F.d.22.1.P.Seq


F


M00004328A:D01


CH01COH


1136 


4726 


RTA00002935F.d.16.1.P.Seq


F


M00054538D:C12


CH17COHLV 


1137 


19479 


RTA00002891F.e.15.1.P.Seq


F


M00001688B:B11


CH01COH


1138 


13280 


RTA00002888F.h.08.1.P.Seq


F


M00001465C:A02


CH01COH


1139 


42708 


RTA00002901F.h.07.1.P.Seq


F 


M0OOO55UA:F05 


CH02COH 


1140 


2022 


RT A00002896F. I\09, 1 P.Seq 


F 


M00004185C:A10


CH01COH


1141 


7281 


RTA00002929F.f.22.1.P.Seq


F


M00040351A:C08


CH14EDT 


1142 


3241 


RTA00002919F.h.09.1.P.Seq


F 


M00033223C:G04 


CH08LNH 


1143 


16161 


RTA00002930F.h.02.1.P.Seq


F


M00055925D:B07


CH15CON


1144 


2766 


RTA00002935F.p.21.1.P.Seq


F 


M0O05547~D:B01 


CH17COHLV 


1145 


11175 


RTA00002886F.b.05.1.P.Seq


F


M00001340D:F07


CH01COH


1146 


7223 


RTA00002923F.i.06.1.P.Seq


F


M00039278C:D03


CH09LNL 


1147 


6786 


RTA00002917F.o.05.1.P.Seq


F


M00032792C:B01


CH08LNH


1148 


186651 


RTA00002923F.d.21.1.P.Seq


F


M00039219B:C08


CH09LNL 


1149 


7878 


RTA00002930F.f.05.1.P.Seq


F


M00055724D:C07


CH15CON


1150 


12624 


RTA00002935F.p.11.1.P.Seq


F 


M000554::A:BOS 


CH17COHLV
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CLONE ID 


LIBRARY 


1151 


23018 


RTA00002886F.f.21.1.P.Seq


F


M00001351B:E11


CH01COH


1152 


186756 


RTA00002911F.g.23.1.P.Seq


F 


M0002696!A:B06 


CH04MAL 


1153 


4177 


RTA00002902F.h.07.1.P.Seq


F 


M00006673A:A03 


CH02COH 


1154


10430


RTA00002894F.g.21.1.P.Seq


F


M00003996B:H07


CH01COH


1155 


31280 


RTA00002903F.k.03.1.P.Seq


F 


M0O0070O"A:E04 


CH02COH 


1156 


19098 


RTA00002925F.e.23.1.P.Seq


F


M00039861C:B12


CH09LNL 


1157 


24105 


RTA00002932F.a.06.1.P.Seq


F


M00042535D:E10


CH18CON


1158 


7750 


RTA00002935F.i.02.1.P.Seq


F


M00054856C:D03


CH17COHLV


1159 


14582 


RTA00002898F.d.07.1.P.Seq


F 


M000O432-lA:D05 


CH01COH


1160 


21356 


RTA00002917F.j.19.1.P.Seq


F


M00032753A:C07


CH08LNH 


1161 


16210 


RTA00002930F.k.17.1.P.Seq


F


M00056345D:A04


CH15CON


1162 


2012 


RTA00002911F.a.10.1.P.Seq


F 


M00027159C:F07 


CH04MAL 


1163 


5391 


RTA00002909F.p.21.1.P.Seq


F


M00022738D:G08


CH03MAH 


1164 


10172 


RTA00002886F.a.05.1.P.Seq


F


M00001333C:B02


CH01COH


1165 


16403 


RTA00002935F.p.15.1.P.Seq


F 


M0O05542-iD:G05 


CH17COHLV


1166 


21920 


RTA00002886F.j.05.1.P.Seq


F


M00001361A:C12


CH01COH


1167 


7070 


RTA00002921F.e.06.1.P.Seq


F 


M00033374D:C07 


CH09LNL 


1168 


45734 


RTA00002901F.j.14.1.P.Seq


F 


M00005569D:G09 


CH02COH 


1169 


12362 


RTA00002929F.i.01.1.P.Seq


F


M00040391A:G05


CH14EDT 


1170 


9405 


RTA00002892F.k.04.1.P.Seq


F


M00003830C:D02


CH01COH


1171 


6507 


RTA00002922F.o.05.1.P.Seq


F 


M00039UOA.F05 I 


CH09LNL 


1172 


10735 


RTA00002925F.b.24.1.P.Seq


F


M00039822A:H02


CH09LNL 


1173


21177


RTA00002935F.d.18.1.P.Seq


F


M00054542B:A10


CH17COHLV


1174 


14950 


RTA00002894F.m.18.1.P.Seq


F 


M0000404 J TD:F12 


CH01COH


1175 


10762 


RTA00002917F.o.08.1.P.Seq


F


M00032793A:G06


CH08LNH


1176 


23170 


RTA00002887F.f.15.1.P.Seq


F


M00001396B:B01


CH01COH


1177 


8487 


RTA00002887F.f.16.1.P.Seq


F


M00001396B:B12


CH01COH


1178 


185798 


RTA00002911F.k.06.1.P.Seq


F 


M00027050A:B02 


CH04MAL 


1179 


8976 


RTA00002896F.h.03.1.P.Seq


F


M00004161B:G07


CH01COH


1180


12159


RTA00002930F.c.01.1.P.Seq


F


M00042894C:A11


CH15CON


1181 


7788 


RTA00002932F.b.13.1.P.Seq


F 


M0004301"C:D08 


CH18CON


1182 


43336 


RTA00002917F.d.09.1.P.Seq


F


M00032688C:A03


CH08LNH


1183 


10313 


RTA00002902F.k.19.1.P.Seq


F 


M00006740B:A09 


CH02COH 


1184 


4588


RTA00002891F.o.11.1.P.Seq


F


M00003782A:B02


CH01COH


1185 


18090 


RTA00002925F.l.17.1.P.Seq


F


M00039981D:B01


CH09LNL 


1186 


185994 


RTA00002911F.p.07.1.P.Seq


F


M00027177B:D04


CH04MAL 


1187 


166276 


RTA00002908F.h.03.1.P.Seq


F 


M000224JSC:H09 


CH03MAH 


1188 


15984 


RTA00002932F.a.10.1.P.Seq


F


M00042621C:C04


CH18CON


1189 


13242 


RTA00002889F.i.11.1.P.Seq


F


M00001550D:B11


CH01COH


1190 


6840 


RTA00002935F.i.06.1.P.Seq


F 


M00054Sd6B:C08 


CH17COHLV


1191 


17265 


RTA00002935F.l.04.1.P.Seq


F


M00055108B:A02


CH17COHLV


1192 


12542 


RTA00002933F.b.17.1.P.Seq


F


M00043152C:B10


CH19COP


1193 


1568 


RTA00002928F.d.10.1.P.Seq


F


M00040174D:G06


CH13EDT 


1194 


8721 


RTA00002901F.k.23.1.P.Seq


F 


M00005606D:B12 


CH02COH 


1195 


13519 


RTA00002898F.j.19.1.P.Seq


F 


M00004;-bSA:Bll 


CH01COH


1196 


4471 


RTA00002890F.d.14.1.P.Seq


F


M00001600B:G01


CH01COH


1197 


11357 
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F 


M0003910^A:E12 


PHftOr VT 


1354 


25441 


RTAOO0O29O6F.L08. LP.Seq 


F 


M000219SLAC02 


CH03MAH 


1355 


4303 


RTA00002897F.o.20.1.P.Seq


F 


M00004' > 9' ; D*C07 




1356 


5741 


RTA00002887F.c.19.1.P.Seq


F 


MOOOOr^ODEO 7 


CH01COH


1357 


17264 


RTA00002900F.a.18.1.P.Seq


F 


M00004831C:G11 


CH02COH 


1358 


11766 


RTA00002925F.f.20.1.P.Seq


F


M00039871C:G05


CH09LNL 


1359 


13618 


RTA00002893F.o.15.1.P.Seq


F


M00003963D:F01


CH01COH


1360 


13903 


RTA00002923F.c.18.1.P.Seq


F 


M00039^04VE09 


CH09LNL 


1361 


10673 


RTA00002927F.h.23.1.P.Seq


F 


1 M000396^6A:E06 


CH12EDT 


1362 


17412 


RTA00002932F.b.11.1.P.Seq


F 


M000430MDD05 


CH18CON 


1363 


2218 


RTA00002919F.a.20.1.P.Seq


F 


M00033028C:A02 


CH08LNH 


1364 


1 5858 


RTA00002923F.i.01.1.P.Seq


F


M00039275B:E02


CH09LNL


1365 


2510 


RTA00002898F.b.14.1.P.Seq


F


M00004316A:B03


CH01COH


1366 


8050 


RTA00002900F.n.04.1.P.Seq


F


M00005383A:C11


CHOICOH 


1367 


186538 


RTA00002929F.e.18.1.P.Seq


F


M00040329A:H05


CH14EDT


1368 


25427 


RTA00002935F.n.20.1.P.Seq


F 


M000553T B C04 


CH17COHLV 


1369 


24098 


RTA00002901F.a.10.1.P.Seq


F 


M0O0O54^DH0^ 


CHOICOH 


1370 


123823 


RTA00002905F.h.08.1.P.Seq


F


M00008071D:H03


CH03MAH 


1371 


3644 


RTA00002901F.c.03.1.P.Seq


F 


M000054-1-DD04 




1372 


27783 


RTA00002917F.a.17.1.P.Seq


F 


M0003' ? 666A C0° 


CH08LNH


1373 


1682 


RTA00002910F.b.03.1.P.Seq


F 


MOCKP^SO ' D D09 


CH03MAH 


1374 


3200 


RTA00002887F.e.07.1.P.Seq


F


M00001395C:F04


CH01COH


1375 


8442 


RTA00002917F.h.23.1.P.Seq


F 


M0003273~B:E12 


CH08LNH


1376 


15353 


RTA00002910F.e.11.1.P.Seq


F


M00022854C:G07


CH03MAH 


1377 


6314 


RTA00002922F.b.06.1.P.Seq


F


M00038618D:D08


CH09LNL 


1378 


93549 


RTA00002909F.j.14.1.P.Seq


F


M00022662C:H04


CH03MAH


1379 


15496 


RTA00002906F.p.03.1.P.Seq


F


M00022088B:H02


CH03MAH 


1380 


16572 


RTA00002886F.k.03.1.P.Seq


F 


M0000136-A:C09 


CH01COH


1381 


74821 


RTA00002890F.p.21.1.P.Seq


F


M00001663A:A12


CH01COH


1382 


11315 


RTA00002889F.d.12.1.P.Seq


F


M00001535B:B10


CH01COH


1383 


10859 


RTA00002894F.c.18.1.P.Seq


F


M00003980D:C06


CH01COH


1384 


15391 


RTA00002914F.f.04.1.P.Seq


F 


M00028193B:E07 


CH08LNH 


1385 


23172 


RTA00002896F.b.18.1.P.Seq


F


M00004141B:F08


CH01COH


1386 


22510 


RTA00002886F.l.05.1.P.Seq


F


M00001368A:C02


CH01COH


1387 


17156 


RTA00002934F.a.08.1.P.Seq


F


M00043455B:C08


CH20COHLV 


1388 


4593 


RTA00002896F.o.18.1.P.Seq


F 


M0000420CC:A04 


CH01COH


1389 


2178 


RTA00002901F.m.08.1.P.Seq


F 


M00O0562cD:GlL 


CH02COH 


1390 


1015 


RTA00002933F.c.11.1.P.Seq


F 


M00043213A:D05 


CH19COP 


1391 


26792 


RTA00002907F.a.18.1.P.Seq


F


M00022103C:D05


CH03MAH


1392 


27830 


RTA00002921F.c.07.1.P.Seq


F 


M00O3334-iA:B06 


CH09LNL 


1393 


14648 


RTA00002898F.j.11.1.P.Seq


F 


M0000436fC:GU 


CH01COH


1394 


12585 


RTA00002897F.i.20.1.P.Seq


F


M00004269A:F11


CH01COH


1395 


15825


RTA00002916F.d.12.1.P.Seq


F


M00032553A:A07


CH08LNH


1396 


7043 


RTA00002900F.h.07.1.P.Seq


F 


M000050l-3:F02 


CH02COH 


1397 


29354 


RTA00002905F.c.13.1.P.Seq


F 


__M000079S;C:F07 


CH03MAH 


1398 


29703 


RTA00002907F.d.24.1.P.Seq


F 


MOOO?214^C:E12 


CH03MAH


1399 


6811


RTA00002913F.b.07.1.P.Seq


F 


M0002772O:D04 


CH04MAL


1400 


12657 


RTA00002906F.fr 20. LP.Seq 


F 


MO0O2l86cC:H0S 


CH03MAH 






ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY


1401 


2033 


RTA00002922F.em:,P,Seq 


F 


M0003902-iD:E12 


CH09LNL 


1402 


24229


RTA00002920F.b.04.1.P.Seq


F


M00033329C:C02


CH08LNH


1403 


20664 


RTA00002886F.a.07.1.P.Seq


F


M00001338C:F05


CH01COH


1404 


3656 


RTA00002902F.f.20.1.P.Seq


F 


M00006641B:F05 


CH02COH 


1405 


10998 


RTA00002931F.c.07.1.P.Seq


F


M00042878D:G06


CH16COP


1406 


1150 


RTA00002922F.j.14.1.P.Seq


F


M00039081B:G07


CH09LNL 


1407 


45221


RTA00002900F.h.06.1.P.Seq


F


M00005013D:H05


CH02COH


1408 


34505


RTA00002901F.a.16.1.P.Seq


F


M00005423C:A10


CH02COH 


1409 


8175 


RTA00002924F.f.01.1.P.Seq


F 


M00039472B:E05 


CH09LNL 


1410 


8175 


RTA00002924F.e.24.1.P.Seq


F 


M00039472B:E05 


CH09LNL 


1411 


19375 


RTA00002903F.n.01.1.P.Seq


F


M00007081B:C08


CH02COH 


1412 


10866 


RTA00002929F.c.15.1.P.Seq


F


M00040219B:B07


CH14EDT 


1413 


24166 


RTA00002891F.k.07.1.P.Seq


F


M00003763A:B02


CH01COH


1414 


15333


RTA00002888F.c.12.1.P.Seq


F


M00001442C:G12


CH01COH


1415 


44436 


RTA00002907F.b.17.1.P.Seq


F


M00022117C:A02


CH03MAH 


1416 


9247 


RTA00002930F.a.16.1.P.Seq


F


M00042560C:G06


CH15CON 


1417 


12317 


RTA00002908F.g.13.1.P.Seq


F


M00022430C:C06


CH03MAH 


1418 


11968 


RTA00002890F.L24. l.P.Seq 


F 


M00001625D:B04


CH01COH


1419 


14181 


RTA00002908F.n.09.2.P.Seq 


F 


M00022499D:D08 


CH03MAH 


1420 


15359 


RTA00002909F.l.02.1.P.Seq


F 


M00022677C:C01 


CH03MAH 


1421 


46675 


RTA00002916F.h.03.1.P.Seq


F 


M000325S-iA:D06 


CH08LNH 


1422 


24898 


RTA00002903F.k.17.1.P.Seq


F 


M00007019B:E01 


CH02COH 


1423 


156424 


RTA00002905F.m.22.1.P.Seq


F 


M00021653A:B02 


CH03MAH 


1424 


11996 


RTA00002901F.b.24.1.P.Seq


F 


M00005445A:E07 


CH02COH 


1425 


11996 


RTA00002901F.c.01.1.P.Seq


F 


M00005445A:E07 


CH02COH 


1426 


4784 


RTA00002894F.e.20.1.P.Seq


F


M00003988D:B01


CH01COH


1427 


9120 


RTA00002914F.h.10.1.P.Seq


F


M00028210B:H03


CH08LNH 


1428 


11295 


RTA00002890F.j.15.1.P.Seq


F


M00001632C:A10


CH01COH


1429 


3991 


RTA00002896F.h.05.1.P.Seq


F


M00004162D:F02


CH01COH


1430 


20353 


RTA00002908F.b.06.1.P.Seq


F


M00022367D:G11


CH03MAH 


1431 


12823 


RTA00002921F.h.01.1.P.Seq


F 


M00033434D:F05 


CH09LNL 


1432 


147419 


RTA00002906F.g.05.1.P.Seq


F 


M00021952B:G06 


CH03MAH 


1433 


12174 


RTA00002919F.f.13.1.P.Seq


F 


M00033071D:E08 


CH08LNH 


1434 


35608 


RTA00002897F.o.24.1.P.Seq


F


M00004296B:D03


CH01COH


1435 


2325 


RTA00002894F.g.07.1.P.Seq


F


M00003994A:B10


CH01COH


1436 


166261 


RTA00002908F.l.05.1.P.Seq


F 


M00022475D:C07 


CH03MAH 


1437 


5713 


RTA00002920F.a.09.1.P.Seq


F


M00033324B:F04


CH08LNH 


1438 


3624 


RTA00002910F.g.06.1.P.Seq


F 


M00022901A:C05 


CH03MAH 


1439 


10305 


RTA00002909F.a.07.1.P.Seq


F


M00022530B:C04


CH03MAH 


1440 


7768 


RTA00002910F.k.22.1.P.Seq


F


M00022992B:G12


CH03MAH 


1441 


9847 


RTA00002908F.p.07.1.P.Seq


F


M00022516B:C05


CH03MAH 


1442 


8583 


RTA00002887F.o.06.1.P.Seq


F


M00001426C:F06


CH01COH


1443 


24376 


RTA00002900F.b.07.1.P.Seq


F


M00004836B:C02


CH02COH 


1444 


8743 


RTA00002907F.n.19.1.P.Seq


F


M00022262A:F06


CH03MAH


1445 


22251 


RTA00002926F.c.10.2.P.Seq


F 


M00040079B:F06 


CH09LNL 


1446 


12337 


RTA00002928F.d.07.1.P.Seq


F


M00040173D:A04


CH13EDT 


1447 


13623 


RTA00002911F.d.08.2.P.Seq


F


M00026842B:A01


CH04MAL 


1448 


5521 


RTA00002887F.j.06.1.P.Seq


F


M00001406B:H09


CH01COH


1449 


2193 


RTA00002933F.a.13.1.P.Seq


F 


M0004307~B:Fll 


CH19COP


1450 


773 


RTA00002889F.j.02.1.P.Seq


F


M00001551D:H09


CH01COH







ID


CLUSTER


SEQ NAME


ORIENTATION


CLONE ID


LIBRARY


1451


142367


RTA00002927F.h.11.1.P.Seq


F


M00039630D:B07


CH12EDT


1452


1 010.1 

l> 284 


RTA00002889F.e.10.1.P.Seq


F


M00001537B:H10


CH01COH 


1453


24011


RTA00002924F.c.17.1.P.Seq


F


M00039440C:G06


CH09LNL


1454


5930


RTA00002911F.f.08.1.P.Seq


F


M00026910B:G06


CH04MAL


1455


21081


RTA00002902F.c.05.1.P.Seq


F


M00005822C:A04


CH02COH


1456


3662


RTA00002925F.c.07.1.P.Seq


F


M00039826D:E04


CH09LNL


1457


4873


RTA00002930F.o.05.1.P.Seq


F


M00042719A:G08


CH15CON


1458


11214


RTA00002896F.h.01.1.P.Seq


F


M00004161A:E08


CH01COH


1459


22888


RTA00002892F.l.09.1.P.Seq


F 


M0000383?C:D10 


CH01COH 


1460 


15490 


RTA00002925F.k.08.1.P.Seq


F


M00039932B:A07


CH09LNL 


1461


112819


RTA00002905F.o.13.1.P.Seq


F


M00021676C:G03


CH03MAH 


1462 


19688 


RTA00002896F.l.02.1.P.Seq


F 


M00004179D:A12 


CH01COH 


1463 


lo!32 


RTA00002922F.n.20.1.P.Seq


F


M00039138B:G05


CH09LNL 


1464


25022


RTA00002914F.i.21.1.P.Seq


F 


M00028219B:H05 


CH08LNH 


1465 


16303 


RTA00002888F.b.12.1.P.Seq


F 


M00001433A:E01 


CH01COH 


1466


16828


RTA00002897F.d.04.1.P.Seq


F


M00004214A:E05


CH01COH


1467


14295


RTA00002921F.a.18.1.P.Seq


F 


M00033296C:C11 


CH09LNL 


1468


1979


RTA00002930F.f.06.1.P.Seq


F


M00055725D:D09


CH15CON


1469


36248


RTA00002338F.g.Oo, 1. P.Seq 


F 


M00001460C:E10


CH01COH 


1470 


5676 


RTA00002926F.b.22.2.P.Seq 


F 


M00040075B:A05 


CH09LNL 


1471 


1239 


RTA00002887F.o.21.1.P.Seq


F


M00001428B:C10


CH01COH 


1472 


7937 


RTA00002917F.g.22.1.P.Seq


F


M00032728D:F01


CH08LNH


1473 


4483 


* I 4 r« # AM .m 


F 


M00026856B:G03 


CH04MAL 


1474


7796


RTA00002925F.c.05.1.P.Seq


F


M00039826B:F09


CH09LNL


1475


17330 


RTA000029 loF.a.Oj. 1 .P.Scq 


F 


M00028616C:D09 


CH08LNH 


1476


25620


RTA00002902F.f.09.1.P.Seq


F


M00006631C:A04 


CH02COH 


1477


20601


RTA00002923F.l.20.1.P.Seq


F 


M00039326A:G07 


CH09LNL 


1478


6205


RTA00002923F.g.21.1.P.Seq


F


M00039258C:C01


CH09LNL 


1479 


726 


RTA00002913F.b.16.1.P.Seq


F


M00027734D:C03


CH04MAL 


1480


104999


RTA00002908F.g.17.1.P.Seq


F


M00022435B:G12


CH03MAH 


1481 


30321 


RTA00002919F.o.17.1.P.Seq


F 


M00033264B:E06 


CH08LNH 


1482 


5878 


RTA00002913F.a.16.1.P.Seq


F


M00027688C:C01


CH04MAL 


1483 


5944 


RTA00002905F.m.07.1.P.Seq


F 


M00021649B:A02 


CH03MAH 


1484


5796


AT * AAAAAAAOT? ' 1 1 1 A C 

RTA0O0O29O8F.1 .2 1 , 1 .P.Seq 


F 


M0002245*A:G05 


CH03MAH 


1485


3804


RTA00002935F.m.24.1.P.Seq


F


M00055254A:H03


CH17COHLV 


1486


2728


RTA00002918F.a.22.1.P.Seq


F 


M0003282^A;A06 


CH08LNH 


1487


3804


RTA00002935F.n.01.1.P.Seq


F


M00055254A:H03


CH17COHLV 


1488


39.?2 


AT \ AAA AAA 1 CT" in -\ n P 

RTA0O0O29 1 oF.o. 19. 2. P.Seq 


F 


m m AAA **• * 1 ^ •■*•■ v*^ ft A 

MOOOj2o1 l,;E10 


CH08LNH 


1489


16691


A T * AAAA^ OA 1 P 1 fi i"* 

RTA0O00289 IF.o.Oj. LP,i>eq 


F 


M0000j7SCA;G01 


CH01COH 


1490


15430 1 


RTA00002900F.g.10.1.P.Seq


F


M00005003D:C02


CH02COH 


1491


5637


RTA00002925F.b.18.1.P.Seq


F 


M000j9S:0B:F06 


CH09LNL


1492


16633


RTA00002897F.g.15.1.P.Seq


F 


M0000424c3:H07 


CH01COH 


1493


21826 


RTAOO0O2S98F.g,06- P.Seq 


F 


% ft* A A A A 4 ^> ft 4 » 1 1 

M00004j4-A:G1 I 


CH01COH


1494 


2^193 


RTA0OOO29 1 9F.L09. 1 -P.Seq 


F 


IV1UUU_> J 1 L4ClJ.»*\UJ 


UriUoL-lNri 


1495 


10720 


RTA00002898F.c.14.1.P.Seq


F 


M0000432CC:E07 


CH01COH 


1496 


22491 


RTA00002925F.m.06.1.P.Seq


F


M00040003A:G10


CH09LNL


1497 


10423 


RTA00002915F.n.13.2.P.Seq


F 


M0003:50'D:G08 


CH08LNH 


1498 


4953 


RTA0OOO29l6F.h.IU.P.Seq 


F 


M0003:5ScC:B04 


CH08LNH 


1499 


185567 


RTA00002911F.p.08.1.P.Seq


F


M00027178B:A11


CH04MAL 


1500 


25605 


RTA00002924F.m.22.1.P.Seq


F


M0003971C3:A0l ! 


CH09LNL






ID 


CLUSTER 


SEQ NAME


ORIENTATION 


CLONE ID 


LIBRARY 


1501 


29446 


RTA00002906F.n.24.1.P.Seq


F 


M00022070B:B04 


CH03MAH 


1502 


9668 


RTA00002908F.g.02.1.P.Seq 


F 


M00022421A:F12


CH03MAH 


1503 


29446 


RTA00002906F.o.01.1.P.Seq


F


M00022070B:B04


CH03MAH 


1504 


7171 


RTA00002887F.m^2.1J».Sea 


F 


M00001421B:E07 


CH01COH 






Table 3 





Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


1


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>




2


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>


3


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>


4


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>


5


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>


6


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>


7 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 




8


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 




9


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>




10


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>


11


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>


12


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>


13


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>




14


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>




15


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>


16


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>


17 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 




18


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 




19


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>




20


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>




21


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>


22


<NONE>


<NONE>


<NONE>


<NONE>


<NONE>














23


<NONE>


<NONE>


<NONE>


548562


GENOME POLYPROTEIN
[CONTAINS: RNA
REPLICASE; HELICASE;
COAT PROTEIN] 2.7.7.48) -
apple stem grooving virus
(strain P-209)


9.:












24


<NONE>


<NONE>


<NONE>


416959


EXCISION REPAIR PROTEIN
ERCC-6 DNA repair helicase
ERCC6 - human >gi|l82LSl
(L04791) excision repair protein
[Homo sapiens]




25


<NONE>


<NONE>


<NONE>


3327096


(AB014541) KIAA0641 protein
[Homo sapiens]


8~












26


<NONE>


<NONE>


<NONE>


861293


(U2874DF35D2.1 gene
product [Caenorhabditis
elegans]


7.9












27


<NONE>


<NONE>


<NONE>


3297821


(AL031032) extensin-like
protein


5.5












28


<NONE>


<NONE>


<NONE>


2119692


transforming growth factor-beta
type III receptor - chicken
>gi|511343 (L01121)
transforming growth factor-beta
type III receptor [Gallus gallus]


5.1


29


<NONE>


<NONE>


<NONE>


213602$


protein kinase PRK1 - human


5.0
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Nearest Neighbor (BlasiN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















30 


<NONE> 


<NONE> 


<NONE> 


2746912 


(AF040659) No definition line
found [Caenorhabditis elegans]


4.6


31 


<NONE> 


<NONE> 


<NONE> 


2358287 


(AF010404) ALR [Homo
sapiens]


4.5 


32 


<NONE> 


<NONE> 


<NONE> 


3877816


(Z96048) predicted using
Genefinder; cDNA EST
EMBL:D65516 comes from this
gene; cDNA EST ykl91a5.5
comes from this gene
[Caenorhabditis elegans]


4.4 


33 


<NONE> 


<NONE> 


<NONE> 


4140268 


(Y14953) SRCR domain, 
membrane form 2 


4.1 


34 


<NONE> 


<NONE> 


<NONE> 


1708663 


(U51183) transposase [Hydra
vulgaris] 


4.0 


35 


<NONE> 


<NONE> 


<NONE> 


1184100


(U45958) pistil extensin-like 
protein [Nicotiana alata] 


3.9 


36 


<NONE> 


<NONE> 


<NONE> 


121073 


GLUCOCORTICOID 
RECEPTOR (GR) 


3.9 


37 


<NONE> 


<NONE> 


<NONE> 


1718298 


(U75698) ORF 45; contains an
extended acidic domain; EBV
BKRF4 homolog [Kaposi's
sarcoma-associated herpesvirus]
homolog, conserved in other
gamma-herpesviruses


2.6 


38 


<NONE> 


<NONE> 


<NONE> 


2352538 


(AF006564) alcohol
dehydrogenase [Drosophila
persimilis]


1.4


39 


<NONE> 


<NONE> 


<NONE> 


3192897 


(AF066071) SP85; PsB
[Dictyostelium discoideum]


1.4 


40 


<NONE> 


<NONE> 


<NONE>


561645 


(L33421) This CDS feature is 
included to show the translation 
of the corresponding V_region. 
Presently translation qualifiers 
on V region features are illegal 


1.0 


41 


<NONE> 


<NONE>


<NONE> 


3878S57 


(ZXJiflJ) predicted using — 
Genefinder; cDNA EST
EMBL:D35016 comes from this
gene; cDNA EST
EMBL:D32583 comes from this
gene; cDNA EST
EMBL:D35258 comes from this
gene; cDNA EST
EMBL:C11471 comes from this
gene; cDNA EST EMBL:C... 


1.0 















(U75903) UGT1A7 [Rattus 




42 


<NONE> 


<NONE> 


<NONE> 


1658571 


norvegicus] 


1.0 


43 


<NONE> 


<NONE> 


<NONE> 


2338034 


(AF005370) putative immediate 
early protein [Alcelaphine
herpesvirus 1]


0.86 


44 


<NONE> 


<NONE> 


<NONE> 


3043714 


(AB011167) KIAA0595 protein 
[Homo sapiens] 


0.42


45 


<NONE> 


<NONE> 


<NONE> 


1723710 


PROTEIN IN ASN2-PHB1
INTERGENIC REGION
>gi|2131678|pir||S64439
hypothetical protein YGR130c -
yeast (Saccharomyces
cerevisiae)

>gi|1323215|gnl|PID|e243523
(Z72915) ORF YGR130c
[Saccharomyces cerevisiae]


0.40 




46


<NONE>


<NONE> 


<NONE> 


1723710 


III iaJ l rLC. i ILrti- n±s 

PROTEIN IN ASN2-PHB1

>gi|2131678|pir||S64439
hypothetical protein YGR130c -
yeast (Saccharomyces
cerevisiae)

>gi|1323215|gnl|PID|e243523
(Z72915) ORF YGR130c
[Saccharomyces cerevisiae]


0.38 


47 


<NONE> 


<NONE> 


<NONE> 


2996117 


(AF046125) immediate early 2 
[Rat cytomegalovirus]


0.26 


48 


<NONE> 


<NONE> 


<NONE> 


4151809 


(AF102855) synaptic SAPAP-
interacting protein Synamon


0.024 


49 


<NONE> 


<NONE> 


<NONE> 


2773341 


(AF040954) putative protein 
phosphatase 1 nuclear targeting 
subunit [Rattus norvegicus] 


0.017 


50 


<NONE> 


<NONE> 


<NONE>


1653522 


(D90914) hypothetical protein


1*. rwi 


51 


<NONE> 


<NONE> 


<NONE> 


3219965 


HYWBiEtiCaI xobbkD 

TRP-ASP REPEATS 
CONTAINING PROTEIN 
C2C6.04C IN CHROMOSOME 
I 


3e-06


52 


<NONE> 


<NONE> 


<NONE> 


4185567 


(AF115480) cAMP-dependent
Rap1 guanine-nucleotide
exchange factor [Mus musculus]


7e-07 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (B)astX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












rlVP0TffiTtCAt"43iK!6 ' 




53 


<NONE> 


<NONE> 


<NONE> 


U76527 


PROTEIN C34E10.1 IN
CHROMOSOME III
>gi|500724 (U10402) C34E10.1
gene product [Caenorhabditis
elegans]


3e-20


54 


X85444 


G.pallida repetitive 
DNA element 


5.0 


2118936 


beta-globin - chimpanzee 
(fragment) 


8.6 


55 


X72961


Synechococcus sp. 
cpeB, cpeA genes and 
ORF3 


5.0


462569 


MICROTUBULE-
ASSOCIATED PROTEIN 1A
microtubule-associated protein
MAP1A - rat >gi|205538
norvegicus]


2.2 


56 


U94747 


Human WD repeat 
protein HAN11
mRNA, complete cds


5.0


3875538 


(Z67990) similar to cuticle
collagen 


13 


57 


AF032108 


Homo sapiens 
integrin alpha-7
mRNA, complete cds 


5.0 


2147194 


collagen - Paralvinella grasslei


0.002 


58 


Z50798 


G.gallus mRNA for
p52 


5.0 


3122885 


ASPARTYL-TRNA 
SYNTHETASE synthetase 
[Bacillus subtilis] 


3e-11


59 


AB002384 


Human mRNA for
KIAA0386 gene,
complete cds


5.0 


2632098 


(Y15513) Prodos protein
[Drosophila melanogaster]


9e42 


60 


X14835 


Thermofilum pendens 
DNA for 16S and 
23S ribosomal RNA, 
tRNA-Met, and tRNA-
Gly


4.9


<NONE> 


<NONE> 


<NONE> 


61 


U87149 


Hordeum vulgare
nucellin gene, 
complete cds 


4.9 


128578 


NONSTRUCTURAL
PROTEIN NS-S spotted wilt
virus (strain CPNH1) non-
structural protein [Tomato
spotted wilt virus]


2.8 


62 


D87541 


Mus musculus gene
for integrin alpha v 
subunit, promoter 
region 


4.9 


136956 


HYPOTHETICAL PROTEIN 
UL61 cytomegalovirus (strain
AD169) cytomegalovirus]


0.038 


63 


U72520 


Mus musculus mena
protein (Mena)
mRNA, complete cds


4.9 


3413892 


(AB007934) KIAA0465 protein 
[Homo sapiens]


6e-07
















64 


S79797 


enzymatic 
glycosylation- 
regulating gene [rats, 
Sprague-Dawley, 
streptozotocin 
diabetic, heart,
mRNA, 5010 nt]


4.8 


<NONE> 


<NONE> 


<NONE> 


65 


AB011102 


Homo sapiens mRNA 
for KIAA0530
protein, partial cds 


4.8 


138022 


RECEPTOR RECOGNIZING
PROTEIN gp38 - phage Ox2
>gi|15126 (X05675) gene 38
(AA 1-266); pid:g15126
[Bacteriophage Ox2]


3.6 


66 


AF1009S5 


Penaeus monodon 
phosphopyruvate 
hydratase mRNA, 
complete cds 


4.8


500615 


(D16221) endochitinase [Oryza
sativa]


2.8 


67 


U31756 


Bacillus subtilis
gamma-
aminobutyrate
permease cds


4.8


3880699 


(AL021471) similar to
Eukaryotic aspartyl proteases
[Caenorhabditis elegans]
Eukaryotic aspartyl proteases
[Caenorhabditis elegans]


2.8 


68 


U25111


Pisum sativum 
chloroplast 
processing enzyme 
mRNA, nuclear gene 
encoding chloroplast 
protein, complete cds. 


4.8 


1800145 


(U83658) FH1/FH2 protein
homology [Emericella nidulans]


1.6


69 


U00454 


Mus musculus Cdx-2 
homeobox protein 
gene, complete cds. 


4.7 


<NONE> 


<NONE> 


<NONE> 


70 


M84166 


Hamster c-Ha-ras 
protein gene, 
complete cds. 


4.7 


1710606 


RENIN-BINDING PROTEIN
(RNBP) protein [Rattus
norvegicus]


0.88


71 


AF037516 


Mus musculus major
sperm fibrous sheath 
protein Pro- 
mAKAP82 gene, 
alternative splice 
exons T and I" 


4.6 


<NONE> 


<NONE> 


<NONE>


72 


X74160 


M.esculenta mRNA 
for granule-bound 
Starch synthase 


4.6


<NONE>


<NONE> 


<NONE> 


73 


M97487 


Haloferax volcanii 
superoxide dismutase 
(sod2) gene, complete
cds. 


4.6 


2623307 


(AC002409) putative ubiquitin 
protease [Arabidopsis thaliana]


3.4 






Drosophila 










74 


M57889 


melanogaster 
suppressor of sable 
gene, complete cds. 


4.5 


<NONE> 


<NONE> 


<NONE> 


75 


D49708 


Rattus norvegicus 
mRNA for RNA 
binding protein 


4.5 


<NONE> 


<NONE> 


<NONE> 


76 


D31853 


Yeast GTS1 gene for
glycin-threonin/serine 
repeat protein, 
complete cds 


4.5 


2447195


(U42580) NETTF (7x), DETTS 
(4x) [Paramecium bursaria 
Chlorella virus 1] 


3.3 


77 


Z47036 


Human partial cDNA 
sequence, clone 
bs6l3; 


2.9 


<NONE> 


<NONE> 


<NONE> 


78 


L19660 


Rattus norvegicus 
gastric inhibitory 
peptide receptor 
mRNA, complete cds 


1.1 


2358279 


(AF007871) torsinA [Homo
sapiens] 


2e-07 


79 


X82841 


A.thaliana Aco gene 


2.6 


483212 


immediate-early protein IE110 -
human herpesvirus 1 (strain
HFEM) (fragment)


8.4 


80 


X61931 


S.purpurascens famA 
and famB genes for 
FAS domain and acyl-
CoA-dehydrogenases,
respectively 


2.6 


2290534 


(U95031) sublingual gland 
mucin [Homo sapiens] 


0.47


81 


U13680 


Human lactate 
dehydrogenase-C 
(LDH-C) mRNA,
complete cds.


2.5 


2887449 


(AB007874) KIAA0414 [Homo 
sapiens] 


31 


82 


AB007869 


Homo sapiens 
KIAA0409 mRNA, 
partial cds 


2.4 


3130157 


(AB008859) pheromone 
receptor [Fugu rubripes]


5.4 


83 


X97479 


H.sapiens mas proto-
oncogene, 5' region 


2.1 


<NONE> 


<NONE> 


<NONE> 


84 


X98374 


R.norvegicus mRNA 
for KIS protein 


1.9


<NONE> 


<NONE> 


<NONE> 


85 


AE000710 


Aquifex aeolicus 
section 42 of 109 of 
the complete genome 


1.9 


<NONE> 


<NONE> 


<NONE> 






Homo sapiens mRNA










86 


D30612 


for repressor protein, 
partial cds 


1.9


<NONE> 


<NONE> 


<NONE> 


87 


Y14321


Homo sapiens 
PMP69 gene, exons 
8, 9, 10 & 11


1.9 


<NONE> 


<NONE> 


<NONE> 


88 


D90773 


E.coli genomic DNA, 
Kohara clone 
#262(30,300.5 min,) 


1.9


1536816 


(D78305) DNA binding protein 
[Chlorella virus] 


7.9 


89 


AE000991 


Archaeoglobus 
fulgidus section 116
of 172 of the
complete genome 


1.9 


520645 


(X79095) 

pyruvate,orthophosphate
dikinase [Flaveria trinervia] 


2.7 


90 


U39476 


Rattus norvegicus
p95 Vav (Vav) proto-
oncogene mRNA,
complete cds. 


1.9 


4158178 


(AL023496) hypothetical 
protein 


1.6 


91 


U28838


Human transcription
factor TFIIIB 90 kDa
subunit 


1.9 


2495730 


HYPOTHETICAL PROLINE-
RICH PROTEIN KIAA0269
>gi|1665805|gnl|PID|d1014089
(D87459) Similar to Volvox
carteri extensin (S22697)
[Homo sapiens]


0.23 


92 


U20106 


Rattus norvegicus 
synaptotagmin VII 
mRNA, complete cds.


1.9 


478380 


UL47h protein - Marek's disease 
virus 


0.23 


93 


AF071010 


Mouse mammary 
tumor virus putative 
integrase, env 
polyprotein, and
superantigen mRNA,
complete cds 


1.9 


2781386


(AC004010) similar to Leucine- 
rich transmembrane proteins; 
44% similarity to U42767 
(PID:g1736918) [Homo
sapiens] 


4e-33 


94 


AF061881 


Mesocricetus auratus 
c-fos proto-oncogene 
protein (c-fos) gene, 
complete cds 


1.8 


<NONE> 


<NONE> 


<NONE> 


95 


AE001397 


Plasmodium 
falciparum 
chromosome 2, 
section 34 of 73 of 
the complete 
sequence 


1.8 


<NONE> 


<NONE> 


<NONE> 









Horseshoe crab 










96 


D14701


mRNA for
coagulation factor B,
complete cds


1.8


<NONE>


<NONE>


<NONE>


97 


M29154


P.falciparum
multidrug resistance
(MDR) gene,
complete cds.


1.8


<NONE> 


<NONE> 


<NONE> 


98


L16532


Rattus norvegicus 
(clone pciNru; l ,j - 
cyclic nucleotide 3'-
phosphodiesterase 
(CNPII) mRNA, 
complete cds. 


1.8


<NONE> 


<NONE> 


<NONE> 


99 


AE001434 


Plasmodium 
falciparum 
chromosome 2, 
section 71 of 73 of 
the complete 
sequence 


1.8 


<NONE> 


<NONE> 


<NONE> 


100 


Z46785 


D.melanogaster gene
for protamine
(mst35Bb).


1.8 


<NONE> 


<NONE> 


<NONE> 


101


X69822 


P.sylvestris mRNA
for glutamine
synthetase 


1.8


219896 


i-caiOCSmOn L [rlOmO 

sapiens] 


9.7 


102 


U49055 


Rattus norvegicus 
CTD-binding SR-like
protein rA8 mRNA, 
complete cds 


1.8 


2497252 


INSULIN-LIKE GROWTH
FACTOR BINDING PROTEIN
4 (IGFBP-4) (IBP-4) (IGF-
BINDING PROTEIN 4) factor-
binding protein-4 - sheep
(fragment) factor-binding
protein-4, IGFBP-4 [sheep,
liver, Peptide, 237 aa] [Ovis
aries]


2.5 


103 


L28101 


Homo sapiens 

kallistatin (PI4) gene,

exons 1-4, complete
cds 


1.8 


4204267 


(AC005223) 55585 
[Arabidopsis thaliana]


2.4


104 


U66987 


Pandorina morum 
internal transcribed 
spacer 1, 5.8S 
ribosomal RNA gene, 
and internal 
transcribed spacer 2,
complete sequence 


1.8 


2635909 


(Z99121) permease [Bacillus
subtilis]


1.9






Human polymorphic 










105 


X58033 


MspI site DNA
(D3S3 locus) 


1.8 


2136878 


keratin NAP5 5 - sheep
(fragment) >gi|313722


0.65 


106 


U157S0 


Human p82 (ST5) 
mRNA, alternatively
spliced, complete cds 


1.8 


3638957 


(AC004877) sco-spondin-mucin-
like; similar to P98167 uncertain
[Homo sapiens] 


0.64 


107


AF038535 


Homo sapiens 
synaptotagmin VII 
mRNA, partial cds


1.8 


457927 


(U00690) calcium channel alpha 
1 subunit [Drosophila 
melanogaster] 


0.51 


108 


AF052134 


Homo sapiens clone 
23585 mRNA 
sequence 


1.8 


232263 


HOMEOBOX PROTEIN HOX-
D1 (HOX-4.9)


0.28 


109 


X75208 


H.sapiens HEK2 
mRNA for protein 
tyrosine kinase 
receptor. 


1.8 


1730198 


GROWTH-ARREST-SPECIFIC
PROTEIN 1 gene product 
[Homo sapiens] 


0.22 


110


AB013896 


Xenopus laevis 
mRNA for SOX-D, 
complete cds 


1.8 


2494501 


TRANSCRIPTION FACTOR 
FKH-4 factor [Mus musculus]


0.17 


111 


D16947


Human HepG2 3'
region cDNA, clone
hmd6bl0 


1.8 


3413870 


(AB007923) KIAA0454 protein
[Homo sapiens]


0.002 


112 


D13547 


Mouse DNA, T early 
alpha (TEA) region 


1.3 


3393018 


(AL031174) hypothetical 
protein 


5e-08 


113 


M35498 


Woodchuck c-myc 
protein gene, exon 1. 


1.8 


3183405 


PROTEIN C2C6.07 IN 

CHROMOSOME I 

>gi /\jd\f* |gni|riA-'|ej iy** 

pombe] 

>gi|3451305|gnl|PID|e1316730
(AL031324) very hypothetical
protein [Schizosaccharomyces
pombe]


8e-10 


114 


M84166


Hamster c-Ha-ras 
protein gene, 
complete cds. 


1.8




(AC004665) unknown protein
[Arabidopsis thaliana]


2e-10


115 


U33135 


Mychodea carnosa 
18S ribosomal RNA
gene, complete 
sequence 


1.8 


3334982 


(AC005306) R27216J [Homo 
sapiens]


3e-22


116 


U84003 


Homo sapiens 
putative tumor 
suppressor (BIN1) 
gene, exons 7-12 


1.7 


<NONE> 


<NONE> 


<NONE> 



















117 


AE001121 


Borrelia burgdorferi 
(section 7 of 70) of 
the complete genome 


1.7 


<NONE> 


<NONE> 


<NONE> 


118 


AE001114 


Archaeoglobus 
fulgidus section 165 
of 172 of the 
complete genome 


1.7 


<NONE> 


<NONE> 


<NONE> 


119 


U82064 


Angiostrongylus 
cantonensis adult- 
specific muscle 
protein- 1 gene, partial 
cds 


1.7 


<NONE> 


<NONE> 


<NONE> 


120 


AF041836 


Buchnera aphidicola 
plasmid pLeu-Sg, 
complete plasmid 
sequence 


1.7 


<NONE>


<NONE> 


<NONE> 


121 


M87479 


Lymnaea stagnalis 
FMRFamide gene, 
mature peptides. 


1.7 


<NONE> 


<NONE> 


<NONE> 


122 


M55163 


Xenopus laevis 
fibroblast growth 
factor receptor 
mRNA, complete cds. 


1.7 


<NONE> 


<NONE> 


<NONE> 


123 


S57565 


histamine H2- 
receptor [rats. 
Genomic, 1928 nt] 


1.7 


<NONE>


<NONE> 


<NONE> 


124 


M27256 


Simian 

immunodeficiency 
virus (SIV) pol 
region. 


1.7 


<NONE>


<NONE> 


<NONE> 


125 


U31516 


Human chromosome 
8 anonymous clone 
pBS8-l65 


1.7 


<NONE> 


<NONE> 


<NONE> 


126 


X12671


Human gene for 

heterogeneous 

nuclear 

ribonucleoprotein
(hnRNP) core protein
A1


1.7 


<NONE> 


<NONE> 


<NONE> 


127 


AF009054 


Paeonia suffruticosa 
ssp. spontanea 
alcohol 

dehydrogenase 1B
(Adh1B) gene, partial
cds


1.7


<NONE> 


<NONE> 


<NONE> 
















128 


AF046917 


Mus musculus
transketolase gene,
exon 6 and partial cds


1.7


<NONE> 


<NONE> 


<NONE> 




129


D89053


Homo sapiens mRNA 
for Acyl-CoA 
synthetase 3, 
complete cds 


1.7 


<NONE> 


<NONE> 


<NONE> 


130 


U57968 


Staphylothermus
marinus surface layer-
associated STABLE
protease gene, 
complete cds. 


1.7 


<NONE> 


<NONE> 


<NONE> 


131 


L39072 


Bovine herpesvirus 1 
(clone pw>) 
homologue gene, 
complete cds. 


1.7 


<NONE> 


<NONE> 


<NONE> 


132 


X04980 


Drosophila simulans
retrotransposon 297
5'-LTR and flanks
(pWK1020) 


1.7 


<NONE> 


<NONE> 


<NONE> 


133 


AE001114 


Archaeoglobus
fulgidus section 165
of 172 of the 
complete genome 


1.7 


<NONE> 


<NONE> 


<NONE> 


134


X04434 


Human mRNA for 
insulin-like growth
factor I receptor 


1.7 


<NONE> 


<NONE> 


<NONE> 




135


U07890


Mus musculus 
C57BL/6J epidermal 
surface antigen 
(mesa) mRNA, 
complete cds. 


1.7 


<NONE> 


<NONE> 


<NONE> 


136 


D26163


Human tyrosinase
gene, 5'-flanking
region (cell-specific
transcription)


1.7 


<NONE> 


<NONE> 


<NONE> 


137 


AF093818 


Panorpa nipponensis
NADH 

dehydrogenase 
subunit 5 gene, 
mitochondrial gene 
encoding 
mitochondrial 
protein, partial cds 


1.7 


<NONE> 


<NONE> 


<NONE> 



WO 01/02568 






Xenopus laevis










138 


D50560 


mRNA for 
cytochrome 
complete cds 


1.7 


<NONE> 


<NONE> 


<NONE> 


139 


AF083488 


Mus musculus 
phospholipase D1
(PLD1) gene, exons 
18 and 19, complete 
sequence 


1.7 


<NONE> 


<NONE> 


<NONE> 


140 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1.7


<NONE>


<NONE> 


<NONE> 


141 


M73749 


Streptococcus 
salivarius 

thermophilus beta-D-
galactosidase (lacZ) gene,
complete cds. > ::
gb|M63636|STRLAC
ZZ Streptococcus
thermophilus beta-D-
galactosidase (lacZ)
gene, complete cds.


1.7 


<NONE> 


<NONE> 


<NONE> 


142 


AE001114


Archaeoglobus
fulgidus section 165
of 172 of the
complete genome


1.7


2183023 


(U84971) unknown [Homo 
sapiens] 


9.2 


143 


L01983 


Human type IV 
sodium channel alpha 
polypeptide 


1.7


130504 


"GENOME K3LYPRU1 blN 
[CONTAINS: N-TERMINAL 
PROTEIN (PI); HELPER 
COMPONENT PROTEINASE 
INCLUSION PROTEIN (CI); 6 
KD PROTEIN 2 (6K2); 
GENOME-LINKED PROTEIN 
(VPG); NUCLEAR ... virus
(strain D)


9.2 


144 


L 19731 


Plecotus rafinesquii 
mitochondrial 
cytochrome b gene, 5'
end. 


1.7 


3327096 


(AB014541) KIAA0641 protein 
[Homo sapiens] 


9.1 


145 


AE001114


Archaeoglobus 
fulgidus section 165 
of 172 of the 
complete genome 


1.7 


2183023 


(U84971) unknown [Homo 
sapiens] 


8.8 
















146 


L27218 


Bos taurus serum 
amine oxidase 
mRNA, complete cds. 
> oxidase=amiloride-
binding protein
homolog [cattle, liver,
mRNA, 2664 nt]


1.7


1174459 


SIGNAL TRANSDUCER AND 
ACTIVATOR OF 
TRANSCRIPTION 6 (IL-4 
STAT) >gi|559855 (U16031) IL 
4 Stat [Homo sapiens] 


7.1 


147 


Z49500 


Caenorhabditis 
elegans cosmid
W07E11, complete
sequence 
[Caenorhabditis 


1.7




(AC005223) 40409
[Arabidopsis thaliana]


6.7 


148 


A T /VJO*l*7 1 

AJLAU44 / 1 


Caenorhabditis 
elegans cosmid 

rj-rA, Cumpicic 

sequence 

[Caenorhabditis 

elegans] 


1.7


7dQ7<)f>Q 

MY? f 707 


PERIPLASMIC NITRATE 
REDUCTASE PRECURSOR 
>gi|1086107|pir||S50163 nitrate
reductase large chain precursor,
periplasmic - Thiosphaera
pantotropha >gi|600093
(Z36773) periplasmic nitrate
reductase large subunit
[Paracoccus denitrificans]


6.7 


149 


U43844


Mus musculus cyclin
D3 gene, complete
cds


1.7




(AF062037) capsid protein
precursor [Thosea asigna virus]


5.1 


150 


Z25464 


S.cerevisiae UNF1,
LTV1, MRP8, CYB3
and TGL1 genes,
complete CDS's


1.7


1255404 


(U53151) weak similarity to 
cytochrome b [Caenorhabditis
elegans]


4.1 


151 


U77846 


Human elastin gene, 
partial cds and partial 
3' UTR


1.7


3355682 


(AL031124) putative secreted
lyase


4.0 


152 


X62880 


S.scrofa mRNA for
calcium release 
channel (CRC) 


1.7 


3327080 


(AB014533) KIAA0633 protein
[Homo sapiens]


4.0


153 


Y00067


Human gene for 
neurofilament subunit 
M (NF-M) 


1.7 


479829


heterogeneous ribonuclear
particle protein homolog -
Caenorhabditis elegans 
similarity to RNA recognition 
motifs [Caenorhabditis elegans] 


3.9 



















154 


X68393 


D.melanogaster gene 
for Beta-tubulin, 
exons 1 and 2 


1.7


2342682 


(AC000106) Contains similarity
to Rattus AMP-activated protein
kinase (gb|X95577).
[Arabidopsis thaliana]


3.8


155 


AB012284 


Shuttle vector 
pAUR123 gene for 
Aur1-C, complete cds


1.7 


417704


(ORF1A/1B) [CONTAINS:
RNA-DIRECTED RNA
POLYMERASE; HELICASE;
PROTEASE]


3.8 


156 


M96633 


Rattus norvegicus 
mitochondrial 
intermediate 
peptidase (MIP)
mRNA, complete cds. 


1.7 


2314209 


(AE000613) H. pylori predicted
coding region HP1054


3.1 


157 


U49055 


Rattus norvegicus 

CTD-binding SR-like
protein rA8 mRNA,
complete cds 


1.7 


2497252 


INSULIN-LIKE GROWTH
FACTOR BINDING PROTEIN
4 (IGFBP-4) (IBP-4) (IGF-
BINDING PROTEIN 4) factor-
binding protein-4 - sheep
(fragment) factor-binding
protein-4, IGFBP-4 [sheep,
liver, Peptide, 237 aa] [Ovis
aries]


3.0


158 


Y15907 


Mus musculus mRNA
for myc-intron-
binding protein-1


1.7


912776 


iduronate-2-sulfatase, IDS {EC
3.1.6.13} Peptide Mutant, 550
... 


3.0 


159 


U67600 


Methanococcus 
jannaschii section 142 
of 150 of the 
complete genome


1.7 


2982355 


(AF052252) fork head domain 
protein FKD9 [Danio rerio] 


3.0 


160 


AF013759


Homo sapiens
calumenin (Calu)
mRNA, complete cds


1.7


2982355


(AF052252) fork head domain 
protein FKD9 [Danio rerio] 


2.9 












Human mRNA product




161 


AF062915


Arabidopsis thaliana
putative transcription
factor (MYB90)
mRNA, complete cds


1.7


3878065 


KIAA0077 (TR:Q14997);
cDNA EST yk243h8.5 comes
from this gene; cDNA EST
yk243h8.3 comes from this
gene; cDNA EST yk359h4.5
comes from this gene
[Caenorhabditis elegans]
>gi|3880318|gnl|PID|e1349839
(Z81133) Similarity to Human
mRNA product KIAA0077
(TR:Q14997); cDNA EST
yk243h8.5 comes from this
gene; cDNA EST yk243h8.3
comes from this gene; cDNA
EST yk359h4.5 comes from this
gene


2.3 


162 


X87526 


DNA (chromosome 
3: clone NU003R) 


1.7


3638957 


(AC004877) sco-spondin-mucin-
like; similar to P98167 uncertain
[Homo sapiens] 




163 


AC005573 


Homo sapiens 
chromosome 5, PAC 
clone 202e 13 


1.7 


2465540 


(AF005632) phosphodiesterase 
I/nucleotide pyrophosphatase 
beta [Homo sapiens] 


1.8 


164 


D83402 


Homo sapiens gene 
for prostacyclin 
synthase, exon 10 and 
complete cds 


1.7 


627608 


steroid hormone receptor TR3 - 
human sapiens] 


1.7 




165


AF053700


Homo sapiens deltex
(Dx) mRNA, 
complete cds 


1.7 


2662089 


(AB007864) KIAA0404 [Homo
sapiens] 


1.7 


166 


AF043225 


Mus musculus 6-
pyruvoyl-
tetrahydropterin
synthase (Pts)
mRNA, complete cds


1.7 


2352538 


(AF006564) alcohol 
dehydrogenase [Drosophila 
persimilis] persimilis]


1.4 



















167 


U52917 


Thermus aquaticus
thermophilus NADH
dehydrogenase I
subunits NQO7,
NQO6, NQO5,
NQO4, NQO2,
NQO1, NQO3,
NQO8, NQO9,
NQO10, NQO11,
NQO12, NQO13, and
NQO14, complete
cds.


1.7




(AB006631) The human 
homolog of mouse Cux-2 
[Homo sapiens]


1.0 


168 


X72222 


M.musculus gene for 
serotonin 2 receptor 


1.7 


3875796 


(Z73426) Similarity to Yeast
hypothetical YIK9 protein
(SW:YIK9_YEAST); cDNA
EST EMBL:T01252 comes
from this gene; cDNA EST
EMBL:D33205 comes from this
gene; cDNA EST
EMBL:D33955 comes from this
gene; cDNA EST
EMBL:D35484 co...


1.0 


169 


U23186 


Crotalus scutulatus

PLA2-like 

pseudogene 


1.7


853971 


(X83413) DR5 [Human
herpesvirus 6] >gi|853972
(X83413) DR5 [Human
herpesvirus 6]


0.99


170 


M83118 


Mus musculus factor
VIII-associated
protein (f8a) mRNA, 
complete cds. 


1.7 


3201617 


(AC004669) hypothetical 
protein [Arabidopsis thaliana]


0.80 


171 


M3S347 


E.coli ATP- 
dependent proteinase 
(lon) gene, complete

cds.


1.7 


4140322 


OT031282> dl283li3 J J i (Clell 
Division Cycle 2-Like 2
(PITSLRE, p58/GTA,
Galactosyltransferase
Associated Protein Kinase))
(isoform beta 2-2) [Homo

HYPOTHETICAL PROLINE-


0J8 


172 


U28838 


Human transcription 
factor TFIIIB 90 kDa 
subunit 


1.7 


2495730 


RICH PROTEIN KIAA0269 
>gi|1665805|gnl|PID|d1014089
(D87459) Similar to Volvox
carteri extensin (S22697)
[Homo sapiens]


0.62





173 


U72487 


Rattus norvegicus
calcium-independent
alpha-latrotoxin
receptor mRNA, 
complete cds 


1.7 


544411


GLYCOPROTEIN GP100
PRECURSOR (P29F8)
discoideum]


0.35


174 


AE000718 


Aquifex aeolicus
section 50 of 109 of 
the complete genome 


1.7 


2497569 


FIBROBLAST GROWTH
FACTOR RECEPTOR 3
PRECURSOR (FGFR-3)
(HEPARIN-BINDING
GROWTH FACTOR
RECEPTOR)
>gi|2117851|pir|p55363
fibroblast growth factor receptor
3 - mouse >gi|199145 (M81342)
fibroblast growth factor receptor
3 [Mus musculus]


0.34


175 


AF016897 


Oryza sativa GDP 
dissociation inhibitor 
protein OsGDI2 
(OsGDI2) mRNA,
complete cds 


1.7 


125362 


MACROPHAGE COLONY
STIMULATING FACTOR I
RECEPTOR PRECURSOR
(CSF-1-R) (FMS PROTO-
ONCOGENE) (C-FMS) factor 1
receptor - cat >gi|163855
(J03149) M-CSF receptor [Felis
domesticus]


0.34


176 


U95102 


Xenopus laevis
mitotic

phosphoprotein 90
mRNA, complete cds


1.7 


85058 


muscarinic acetylcholine
receptor - fruit fly acetylcholine
receptor [Drosophila
melanogaster]


0.20 


177 


AF077352


Chlamydomonas
reinhardtii myosin
heavy chain


1.7 


728901 


ACROSOMAL PROTEIN SP-
10 PRECURSOR SP-10 -
western baboon
>gi|298488|bbs|127113
(S56458) SP-10=intraacrosomal
protein [Papio papio=baboons,
Peptide, 285 aa] [Papio
hamadryas]


0.20 


178 


Z92788


Caenorhabditis
elegans cosmid
F53B8, complete
sequence
[Caenorhabditis
elegans]


1.7 


746516


(U23517) D1022.7
[Caenorhabditis elegans]
>gi|3258651 elegans]


0.068


179 


AF002217 


Ralstonia eutropha
megaplasmid pHG1
nitric oxide reductase 
(norB) gene, 
complete cds 


1.7 


1143538 


(X87883) mitochondrial capsule
selenoprotein [Rattus
norvegicus] >gi|1354135
(U48702) mitochondria
associated cysteine-rich protein
SMCP


0.039


180 


U3U749 


Rat mRNA for 
protein tyrosine 
phosphatase 


1.7


1228035 


(D83776) The KIAA0191 gene 
is expressed ubiquitously.; The 
KIAA0191 protein retains the 
C2H2 zinc-finger at its N-
terminal region. [Homo sapiens]


0.008


181 


M15202


Rat fast skeletal TnT 
gene encoding 
troponin T isoforms, 
complete cds. 


1.7 


731172 


SKIN SECRETORY PROTEIN 
XP2 PRECURSOR 


4e-04 


182 


L07592 


Human peroxisome 
proliferator activated
receptor mRNA, 
complete cds. 


1.7 


4033414 


PUTATIVE IMPORTIN BETA- 
4 SUBUNIT 


2e-06


183 


U64031 


Dendrobium
crumenatum ACC 
synthase gene, 
complete cds 


1.7 


3122885 


ASPARTYL-TRNA 
SYNTHETASE synthetase 
[Bacillus subtilis]


2e-11


184 


AF034970 


Homo sapiens 
docking protein 
(DOK-2) mRNA, 
complete cds


1.7 


2289097 


(U78737) 

alpha(1,3)fucosyltransferase
[Cricetulus griseus]


8e-12


185 


Z12839


L.longiflorum mRNA
encoding calmodulin.

> ::

gb|L18912|LILCALM
ODU Lilium
longiflorum
calmodulin mRNA,
complete cds.


1.7 


2511747


(AF023270) probable
transcriptional regulator drc4


4e-12 



186 



X53459 



187 



K02668 



188


AB008375



189


L36603






Equine arteritis virus
(EAV) RNA genome

> ::

emb|A45589|A45589
Sequence 1 from
Patent WO9519438 >

emb|A58849|A58849
Sequence 1 from
Patent WO9700963 >

gb|AR013959|AR013959
Sequence 1 from
patent US 5773235



1.7 



3979817 






(Z70633) Weak similarity to



E.coli ddl gene
encoding D-alanine:D-
alanine ligase and 
ftsQ and ftsA genes, 
complete cds, and 
ftsZ gene, 5' end. 



1.7 



3879121 



Homo sapiens mRNA
for osteoblast specific
cysteine-rich protein,
complete cds



1.7 



2496945 



Pseudomonas cepacia
(clone Psudom70-1)
heat shock protein 70
(hsp70) gene,
complete cds



2661842 



Human tyrosine-protein kinase
CSK (SW:CSK_HUMAN);
cDNA EST EMBL:C10908
comes from this gene; cDNA
EST EMBL:C12822 comes
from this gene; cDNA EST
yk408c2.3 comes from this
gene; cDNA EST yk408c2.5 ...
Human tyrosine-protein kinase
CSK (SW:CSK_HUMAN);
cDNA EST EMBL:C10908
comes from this gene; cDNA
EST EMBL:C12822 comes
from this gene; cDNA EST
yk408c2.3 comes from this
gene; cDNA EST yk408c2.5 ...



(Z70310) predicted using
Genefinder; Similarity to Mouse
ankyrin (PIR Acc. No. S37771);
cDNA EST EMBL:T01923
comes from this gene; cDNA
EST EMBL:D32335 comes
from this gene; cDNA EST
EMBL:D32723 comes from this
gene; cDNA ES... Genefinder;
Similarity to Mouse ankyrin
(PIR Acc. No. S37771); cDNA
EST EMBL:T01923 comes
from this gene; cDNA EST
EMBL:D32335 comes from this
gene; cDNA EST
EMBL:D32723 comes from this
gene; cDNA ES...



1e-14



HYPOTHETICAL 55.9 KD
PROTEIN EEED8.6 IN
CHROMOSOME II >gi|733603
(U23484) No definition line 
found [Caenorhabditis elegans] 



(Y15732) DNA polymerase beta
[Xenopus laevis]



2e-19



1e-19



6e-20 





190 


Z49760


P.blakesleeanus
mRNA GTP
cyclohydrolase I


ij J 


1731181 


PROTEIN C14A4.3 IN
CHROMOSOME II
>gi|3874230|gnl|PID|e1351618
protein (Swiss Prot accession
number P38376); cDNA EST
yk220e10.5 comes from this
gene [Caenorhabditis elegans]


3e-21


191 


U52428 


Human fatty acid
synthase gene, partial 
cds 


hi J 


4226073 


(AF125443) contains similarity
to S. pombe phosphatidyl
synthase (GB:Z28295)
[Caenorhabditis elegans]


6e-25


192


U12767 


Human mitogen 
induced nuclear 
orphan receptor 


1.6


<NONE>


<NONE>


<NONE>


193 


Z63478 


H.sapiens CpG DNA,
clone 85a12, forward
read cpg85a12.ft1a.


1.6


<NONE>


<NONE>


<NONE>


194 


AF084375


Homo sapiens
inversin protein,
exons 8 and 9


1.6


<NONE> 


<NONE> 




195 


AE001114 


Archaeoglobus 
fulgidus section 165 
of 172 of the 
complete genome 


1.6


<NONE>


<NONE>


<NONE>
<NONE>


196


AF084375


Homo sapiens
inversin protein,
exons 8 and 9


1.6 


<NONE> 


<NONE> 


<NONE>


197


U24217


Kluyveromyces lactis
RNA polymerase II
largest subunit gene,
partial cds


1.6


<NONE>


<NONE>


<NONE>


198


AE000580


Helicobacter pylori
26695 section 58 of
134 of the complete
genome


1.6


<NONE>


<NONE>


<NONE>
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mm 



SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



199 



200 



201 



202 



203 



X62083 



DESCRIPTION 



M28064 



U03114 



U88422 



M68519 



H.sapiens mRNA for 
Drosophila female 
sterile homeotic 
(FSH) homologue > 
gb|M80613|HUMFS 
HG Human homolog 
of Drosophila female 
sterile homeotic 

mRNA, complete cds.
Plasmodium 
brasilianum DNA 
homologous to the 
histidine-rich knob 
protein region of 
Plasmodium 
falciparum. 
Streptomyces
lipase precursor (lip)
gene, complete cds
and unidentified 5'
ORF and 3' ORF,
partial cds.

Strix varia oocyte
maturation factor
Mos (c-mos) proto-
oncogene, partial cds



Human pulmonary 
surfactant-associated 
protein SP-A 
(SFTP1) gene, 
complete cds. 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION



1.6



<NONE>



P VALUE



<NONE>



<NONE>



1.6 



457495 



1.6 



1.6 



3638957 



137618 



(M26647) ORFX
[Saccharomyces cerevisiae]



1.6



3875423



(Z38112) E03A3.6
[Caenorhabditis elegans]



8.4 



(AC004877) sccHspondin-mucinJ 
like; similar to P98167 uncertain] 
Homo sapiens] | 73 



VITAMIN D3 RECEPTOR 
(VDR) receptor [Rattus 
norvegicusl I $4 



204 



AF044575 



Homo sapiens 
transcription factor 
POU4F3 



205 



L4S476 



Homo sapiens
(subclone 3_e10 from
P1 H21) DNA
sequence.



1.6



2133625



GABA transport protein -
tobacco hornworm



1.6



3687297 



206 



M18630



Rat CNS 2',3'-cyclic
nucleotide 3'-
phosphodiesterase



3880315 



(AJ005588) 5-epi-aristolochene | 
synthase 



(Z81133) Similarity to Human
mRNA product KIAA0077
(TR:Q14997) [Caenorhabditis
elegans]



4.9 



4.7 



4.6 



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SEC 
ID

Nearest Neighbor (BlastN vs. Genbank)

ACCESSION
DESCRIPTION
P VALUE

Nearest Neighbor (BlastX vs. Non-Redundant Proteins)

ACCESSION
DESCRIPTION
P VALUE


207


AF027174


Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-B)
mRNA, complete
cds


1.6


267068


TUMOR-ASSOCIATED
ANTIGEN L6


3.6


208 


U53448


Babesia microti heat
shock protein 70
(hsp70) gene,
complete cds


1.6


1255429


(U53155) strong similarity to
the carboxyl two-thirds of valyl
tRNA synthetases
[Caenorhabditis elegans]


2.2


209


AF084367 


Homo sapiens 
inversin protein
mRNA, complete cds


1.6


1730076


PROBABLE
SERINE/THREONINE-
PROTEIN KINASE CY49.28
>gi|1370255|gnl|PID|e247094
(Z73966) pknJ


1 12 1 


210


D55635


Yeast dis1+ gene for
p93dis1, complete
cds


1.6


3128353


(AF010496) maltose transport
inner membrane protein


211


AF035756


Streptomyces sp. 2-
dehydro-3-
deoxyphosphoheptonate
aldolase gene,
partial cds


1.6 


853971 


(X83413) DR5 [Human
herpesvirus 6] >gi|853972
(X83413) DR5 [Human
herpesvirus 6]


12 1 

0.97 J 


212


X73479


O.cuniculus PTPA
mRNA


1.6


3413810


(Y17034) Bassoon [Mus
musculus]


0.94


213 1 

L214 1 


\l 

_X9833Q r 
P 

X64194 U 


I.sapicns mRNA for J 
yanodine receptor 2[ 


1.6 


2072986 


(U95142) putative G-protein-
coupled receptor G-protein-
coupled receptor [Arabidopsis
thaliana]


0.73


-anserinaFMRl J 
enc exons 1 and 2 I 


1.6 1 


128014 | 


NECDIN >gi|9H29|pir|(jN0l48 f 
needtn. brain - mouse 1 
>gi|200020(M80840)necdin 
Mus musculus] 


0 A*> 1 


215


Z92788


Caenorhabditis
elegans cosmid
F53B8, complete
sequence
[Caenorhabditis
elegans]


1.6


( 
[ 

746516 > 


U23517)D1022.7 
Caenorhabditis elegans] | 
gi[3258651 elegans] 


019 1 


216


AE000888


Methanobacterium
thermoautotrophicum
from bases 1098908
to 1112186 (section
94 of 148) of the
complete genome


1.6


462415


INTERFERON-ALPHA/BETA
RECEPTOR ALPHA CHAIN
PRECURSOR (IFN-ALPHA-
REC) >gi|346520|pir||S27387
interferon alpha receptor type 1 -
bovine >gi|432


0.001
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Nearest Neighbor fBlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 














217 


AB008375


Homo sapiens mRNA 
for osteoblast specific 
cysteine-rich protein, 
complete cds 


1.6 


2496945 


HYPOTHETICAL 55.9 KD 
PROTEIN EEED8.6 IN 
CHROMOSOME II >gi|733603 
(U23484) No definition line 
found [Caenorhabditis elegans]


1e-18


218 


M25312 


Orang-utan involucrin 
gene, complete cds.


1.6


3875131


(Z70750) similar to vanadate
resistance protein
transmembranous domains
[Caenorhabditis elegans]


3e-26 


219 


AB012882 


Cyprinus carpio
mRNA for MyoD,
complete cds


1.5


<NONE> 


<NONE> 


<NONE> 


220 


U29487 


Caenorhabditis 
elegans cosmid
C09C7


1.5


<NONE>


<NONE>


<NONE>


221 




M.musculus mRNA 

fnr Nnfrh i 
lor nuivii j 


1.5 


1364094 


integral membrane protein - 
Streptomyces pristinaespiralis
>gi|872306 (X84072) integral
membrane protein
[Streptomyces pristinaespiralis]


4.3 


222 


U72396 


Lycopersicon 
esculentum class II 
small heat shock 
protein Le-HSP17.6
mRNA, complete cds


1.5 


121855 


EXOGLUCANASE II
PRECURSOR cellulose 1,4-beta-
cellobiosidase (EC 3.2.1.91) II
precursor - fungus (Trichoderma
reesei) 1,4-beta-cellobiosidase
(EC 3.2.1.91) II - fungus
cellobiohydrolase II
[Trichoderma reesei]


4.3 


223 


U42391 


Human myosin-IXb
mRNA, complete cds


1.5 


368S428 


(AJ011534) sucrose synthase


4.2 


224 


M92296 


Pongo pygmaeus 
gamma-1 and gamma-
2 globin genes,
complete cds.


1.5 


186413 


(M13144) inhibin A [Homo
sapiens]


0.22 


225 


X94144 


C.japonica mRNA for 
QNR-71 protein 


1.5 


2745737 


(AF029791) UDP-
Gal:betaGlcNAc beta 1,3-
galactosyltranferase-II [Mus
musculus]


3e-08 
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[sec 

ID

Nearest Neighbor (BlastN vs. Genbank)

ACCESSION
DESCRIPTION
P VALUE

Nearest Neighbor (BlastX vs. Non-Redundant Proteins)

ACCESSION
DESCRIPTION
P VALUE


226 


AB014557


Homo sapiens mRNA
for KIAA0657
protein, partial cds


1.5


• 

1212992 


(X90568) Protein sequence and
annotation available soon via
Swiss-Prot; available at present
via e-mail from
LABEIT@EMBL-
Heidelberg.DE [Homo sapiens]


1 


227 


AF000948


Borrelia burgdorferi
oligopeptide
permease homolog
OppAIV (oppAIV)
gene, complete cds


1.3


<NONE>


<NONE>


4e-13
<NONE>


228 


|_AF057287 


Mus musculus 

RAR/Rir> nrnrrin 

mRNA, partial cds 


1.3 


2498005 


MYC PROTO-ONCOGENE
PROTEIN (C-MYC) proto-
oncogene [Sus scrofa]


2.6


229


U38951


Drosophila
melanogaster
vacuolar ATPase
subunit E


1.1


<NONE> 


<NONE>


230


AF027148


Homo sapiens
myogenic
determining factor 3


1.1


3172134


RNA polymerase II
largest subunit [Bonnemaisonia
hamifera]


<NONE>
2.3


231


AF079310


Mus musculus histone
deacetylase 3
(Hdac3) gene, exons
4 through 15 and
complete cds


1.0


1657601


(U66220) unknown
[Nannocystis exedens]


0.25


232


X52134


P.radiata lac gene for
laccase


0.95


996020


(X91638) BRM protein [Gallus
gallus]




233


D89016


Human mRNA for
Neuroblastoma,
complete cds


0.93


<NONE>


<NONE>


0.31
<NONE>


234


X76392


C.familiaris VIP36
(vesicular integral-
membrane protein of
36 kDa) mRNA


0.93


( 
( 

r, 

4176446 h 


AL022238)dll042K10,2J | 
novel protein with probable 
ibGAP domains and Src 1 
omology domain 3) j 


7e-81 1 


L235 1 


N 
P 

AF100694 c 


ins musculus 1 
omiro2 mRNA, 
3mplete cds j 


0.90 


<NONE> 


<NONE> [< 


:NONE>| 



WO 01/02568 



PCT/US00/18374



Nearest Neighbor (BlastN vs. Genbank)

SEQ

ID | ACCESSION | DESCRIPTION



236 | AE000991



237 | Z35922



Archaeoglobus
fulgidus section 116
of 172 of the
complete genome

S.cerevisiae
chromosome II
reading frame ORF
YBR053c



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE | ACCESSION



238 



Rattus norvegicus
metabotropic
glutamate receptor 4b
U47331 mRNA, complete cds



239 | X72810



H.sapiens Ig germline
kappa-chain gene
variable region (L3)



241 | U71597 



Escherichia coli
genes faeG, faeH,
faeI, faeJ and IS629-
like insertion
sequence. > ::
emb|Z11710|ECFAEHIJ
E.coli faeH, faeI
and faeJ genes
encoding FaeH, FaeI
and FaeJ proteins

Phrynosoma
douglassii NADH
dehydrogenase
subunit 4 (ND4)
gene, mitochondrial
gene encoding
mitochondrial
protein, partial cds



0.00 



0.86 



1176579 



<NONE> 



0.82



1550703 



0.69 



3Q23063 



0.69 



2347188 



0.65



<NONE> 



DESCRIPTION 



(kARLym noaqsempm 

>gi|1362345|pir||S55862 
probable membrane protein 
YNL327w - yeast 

(Saccharomyces cerevisiae)

>gi|1302445|gnl|PID|e239572
(Z71603) ORF YNL327w
[Saccharomyces cerevisiae]



P VALUE



6.9



<NONE> 



(Z80225) hypothetical protein
Rv2662 



4.1 



(AF052587) F14 [Xylella
fastidiosa]



6.7



(AC002338) laccase isolog
[Arabidopsis thaliana]



3.9 



<NONE> 



!<NONE> 
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pfel Nearest Neighbor (BlastN vs. GcnbanlQ 



SEQ 

ID | ACCESSION | DESCRIPTION



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE | ACCESSION



243 | D25542 



[Ammonia species 
LSUrRNA gene 
(partial; isolate Tr S 
]5; clone 16 ) | 0.64 

Human mRNA for
golgi antigen gcp372,
complete cds | 0.64



1174506 



U1230 



Cow dopamine
transporter mRNA,
244 | M80234 putative cds. | 0.64



3874972 



245 I AB007918 



Homo sapiens mRNA| 
{for KIAA0449 



protein, pi 
Human U 



in, partial cds 



247 



J266 

[rearranged DNA for 
llambda- 

jimmunoglobulin light | 
[chain 

[Helicobacter pylori, 
strain J99 section U5| 
of 132 of the 
AE0Q1554 [complete genome 



0,64 



2833239 



0.63 



0.62 



H.sapiens CpG DNA,
clone 96e7, reverse
248 | Z64067 read cpg96e7.rt1a.



Pinus sylvestris
microsatellite DNA,
249 | AJ223768 clone SPACi M



0.62 



2072301 



<NONE> 



<NONE> 



DESCRIPTION 



0.62 



<NONE> 



SYNTHETASE glutamate-
tRNA ligase (EC 6.1.1.17) -
Haemophilus influenzae (strain
Rd KW20) >gi|1573240
(U32713) glutamyl-tRNA
synthetase (gltX) [Haemophilus
influenzae Rd]



ultra-high-sulfur keratin 1 
mouse 

(Z99709) similar to Elongation 
factor Tu family (contains 
ATP/GTP binding P-loop); 
cDNA EST EMBL.D76223 
comes from this gene; cDNA 
EST yk478c5.5 comes from this | 



1.2 



EPIDERMAL GROWTH
FACTOR RECEPTOR
KINASE SUBSTRATE EPS8
>gi|530823 (U12535) epidermal
growth factor receptor kinase
substrate [Homo sapiens]



8e-06



2e-14



(U95102) mitotic
phosphoprotein 90 [Xenopus
laevis]



<NONE> 



<NONE> 



1.5 



<NONE> 



l<NONE> 



1 <NONE> 
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SEQ 
ID



250 



251 



252 



253 



Nearest Neighbor (BlastN vs. Genbank)



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION | DESCRIPTION | P VALUE | ACCESSION



254 



255 



256 



257 



258 



259 



Bacteriophage Pi banl 
AJ011592 ( gene 

Xenopus laevis



0.62



2493689



survival of motor
neuron protein
interacting protein 1
(SIP1) mRNA,
AF027151 complete cds



0,62 



Helobdella triserialis
AJ000376 mRNA for actin | 0.62



4007790 



DESCRIPTION 



PHOTOSYSTEM II 10 KD
PHOSPHOPROTEIN deltoides
>gi|2143326|gnl|PID|e319090
(Y13328) 10kDa
phosphoprotein [Populus
deltoides]



(AL034463) putative single-
strand polynucleotide binding
protein [Schizosaccharomyces
pombe]



iRat thymosin beta 4 
M6923 1 fecne (pTB4G) Jntron. 



0,62 



Homo sapiens X11L2
mRNA for X11-like
protein 2, complete

AB021638 cds | 0.61

jBacteroides ~~ 



1117968 



4176370 



|(U40763) CARS-Cyp [Homo 

pi ens] sapiens! 
I(AUJU5U38) similar to calcium 
[independent phospholipase A2; 
similar to AC004392 
(PED:g3367519) [Homo 
Jsapiens] 



P VALUEl 



<NONE> 



<NONE> 



D26470 
J04737 



U06756 



S75756 



gingivalis DNA for
arginyl
endopeptidase,
complete cds | 0.61

A.thaliana ATPase
gene, complete cds | 0.61
IBos taurus clone 
bml308 

microsatellite and areJ 
|lp repeat regio n. 0.61 

p15=cyclin D-
dependent kinases 4
and 6-binding
protein/p15 product
{exon/intron 1}
[human, brain tumors,
Genomic, 753 nt] | 0.61



<NONE> 
<NONE>

1922280



<NONE>
<NONE>



(Y09905) snail like protein
[Gallus gallus]



2.0 



0.90 



6e-51



L39837 



Drosophila

melanogaster tumor
suppressor (warts)
mRNA exons 1-8,
complete cds.



484938 



hypothetical protein 253 - 
Streptomyces griseus plasmid
pSGI (fragment) 



0.61 



3875131



(Z70750) similar to vanadate
resistance protein
transmembranous domains
[Caenorhabditis elegans]



<NONE>
<NONE>

0.51 



0.13 



1e-09






SEQ
ID

Nearest Neighbor (BlastN vs. Genbank)

ACCESSION
DESCRIPTION
P VALUE

Nearest Neighbor (BlastX vs. Non-Redundant Proteins)

ACCESSION
DESCRIPTION
P VALUE


| 


U52428 


Human fatty acid 
synthase gene, partial
cds 


I 

0.61 


1 4226073 


(AF125443) contains similarity
to S. pombe phosphatidyl
synthase (GB:Z28295)
[Caenorhabditis elegans]


2e-26


261 


X15292 


Plasmodium 
falciparum gene for 
heat-shock protein 
pPf203 


0.60 


<NONE>


<NONE>


<NONE>


262


AB020663


Homo sapiens mRNA
for KIAA0856
protein, partial cds


0.60


470341


(U00043) No definition line
found [Caenorhabditis elegans]


5.7


263 


U68723 


Human checkpoint 
suppressor 1 mRNA, 
complete cds 


0,60 


544375 


PROTEIN REGULATOR 

Iglucose/galactose binding 

[protein regulator- 

[Agrobacterium tumefaciens 
>gi|142228(L10424) 
glucose/gaiactose binding 

[protein regulator 


57 1 


L 264 


M32687 


S.griseus sporulation 
protein genes 1590 
and 1422. 


0,60 ] 


2582017 


(AF012871) Merg1a' [Mus
musculus]


3.3 1 


1 265 


AJ00533L 


Homo sapiens 
NKCC2 gene, exon 4, 
isoform B 


0.60 I 


3128353 


(AF010496) maltose transport
inner membrane protein


1.5 1 


266 


U14103


Mus musculus RGL
protein mRNA,
complete cds.


0.60 


4099845 


(U90533) serine protease
inhibitor [Streptomyces fradiae]


0.098 1 


267


U95094


Xenopus laevis XL-
INCENP (XL-
INCENP) mRNA,
complete cds


0.59 1 


3282851 


(AF047897) ankyrin-like protein
HGE-ANK [Ehrlichia sp. BPS] 


5.5 1 
4.3 1 


268 


AE000872


Methanobacterium
thermoautotrophicum
from bases 896604 to
912784 (section 78 of
148) of the complete
genome


0.59


401553


HYPOTHETICAL 24 J KD
PROTEIN IN NADB-SRMB
INTERGENIC REGION









JP 1 ACCESSION DESCRIPTION | p VALUE 



269 



Gallus gallus achaete-
scute homologue
(ASH) mRNA,
L11871 complete cds.



0.59



Oryctolagus
cuniculus glycogen
synthase mRNA,
270 | AF017114 complete cds



ACCESSION 



DESCRIPTION 



628110 



j [Human herpesvirus 4] 2 
I [Human herpesvirus 4] 
>gi|I334S38|gnl(PID|e25079 4 
([Human herpesvirus 4] 

>gi|l334840|gnI(Pff>|e25081 6 
([Human herpesvirus 4] 

>gi|I334842|gnl(PlD|c25067 8 
I [Human herpesvirus 4] 

>gi|1334844|gnI|PID|e25069 10 | 
([Human, herpes vims 4] 

>gi|1334846|gnJ(PID|e2507l 12 
([Human herpesvirus 4] 



0.59 



271 | AF027807



272 | U81787



273 | U76036



Homo sapiens beta-
casein (CSN2) gene,
complete cds

Human Wnt10B
mRNA, complete cds
Apteryx australis
ribosomal RNA gene,
mitochondrial gene
for mitochondrial
RNA, partial
sequence



0.59 



0.59 



Homo sapiens mRNA
for KIAA0664
274 | AB014564 protein, partial cds



AF044171



Homo sapiens cyclin-
dependent kinase
inhibitor 2D
(CDKN2D) gene,
partial cds



0.59 



728856 



((AF0671 55) truncated rev 
(protein [Human 
_??5 2932 [immunodefic iency virus type 1 ] 

(267990) similar to cuticle 
J875538 (collagen 



1709851 



3925213 



(AF055088) ATP-binding 
cassette; PsaB [Streptococcus 
^pneumoniae] 

PTB-ASSOCIATED SPLICING



P VALUE



4.2 



NITROGENASE IRON-IRON
PROTEIN ALPHA CHAIN
(NITROGENASE
COMPONENT I)
(DINITROGENASE) capsulatus]
>gi|312238 (X70033)
alternative nitrogenase | 2 4



1.5 



1.4



0.83 



FACTOR (PSF) long form - 
human >gi|38458 (X70944)
PTB-associated splicing factor

[Homo sapiens] 



0.17 



(AL032626) Y37D8A.17 
1[Caenorhabditis elee 



3e-10 






SEQ 
ID 



276 



277 



278 



279 



280 



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION | DESCRIPTION | P VALUE



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



ISaccharomyces 
cerevistae cdc2/cdc28| 
related protein kinase [ 
H9640 gene, complete cds. j 0,59 
Human DNA 



388011$ 



DESCRIPTION 



sequence from 
cosmidE140G5 on 
chromosome 22. 
complete sequence 
Z80999 [Homo sapiens] 



ttsapiens WNT8B 
YU108 gene 



0.58 



<NONE> 



0.58 



Sphyraena idiastes
lactate dehydrogenase-

U80001 A | 0.58

S.cerevisiae
chromosome X
reading frame ORF
Z49637 YJR137c | 0.58



<NONE> 



281



282 



283 



284 



285 



286 



287 



288 



H.sapiens ALAD 
gene for 

[porphobilinogen 
X64467 synthase 



<NONE> 



<NONE> 



(Z81130) T23G11.9
[Caenorhabditis elegans]



<NONE>



<NONE> 



<NONE> 



<NONE> 



G.gallus hox B3 
X74506 mRNA



0.58 



<NONE> 



Cochliobolus
heterostrophus
U68040 polyketide synthase



0.58



<NONE>



0.58 



<NONE> 



Arabidopsis thaliana 
putative auxin efflux 
carrier protein (PIN1)
AF089084 mRNA, complete cds
Rattus norvegicus
ROK-alpha mRNA, 
U38481 complete cds 



0.58 



0.58 



<NONE> 



<NONE> 



Homo sapiens G 
protein beta 5 subunit 
AF017656 mRNA, complete cds



M96234



AB002339



Human glutathione 
transferase class mu 
number 4 



0.58 



Human mRNA for 
KIAA0341 gene, 
partial cds



0.58 



0.58 



3236249 



1280073 



861293 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE>



(AC0046S4) hypothetical
protein [Arabidopsis thaliana]



(U55366) Similar to cuticle
collagen [Caenorhabditis
elegans]



(U28741) F35D2.1 gene
product [Caenorhabditis
elegans]



l«S



1e-21



<NONE>



<NONE>



<NONE>



<NONE>



<NONE>



<NONE>



<NONE>



9.2



7.1



7.1






Nearest Neighbor (BlastN vs. Genbank)




289 | U11295



290 | D80001



291 | Z11700



292 | M77350



293 | X63787



DESCRIPTION 



Neisseria 
gonorrhoeae 
carbamoyl phosphate 
synthetase 
(glutamine) small 
subunit (carA) and 
large subunit (carB) 
h complete cd s. 
Human mRNA for 
KIAA0179 gene, 
partial cds 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



Escherichia coli
genes faeG, faeH,
faeI, faeJ and IS629-
like insertion
sequence. > ::
emb|Z11710|ECFAEHIJ
E.coli faeH, faeI
and faeJ genes
encoding FaeH, FaeI
and FaeJ proteins



294 1 D63881 



295 I U39378 



296 1 X87987 



Human mRNA for
KIAA0160 gene,
partial cds



Gymnocarena
mexicana 16S
ribosomal RNA gene,
mitochondrial gene
encoding

mitochondrial RNA,
partial sequence



P.pastoris PRC1 gene
>

dbj|E12103|E12103
DNA encoding 
precursor of protease 
from Pichia pastoris 



0.58 



0.58 



2425135 



4097223



DESCRIPTION



P VALUE



(AF020283) DG2044 gene
product [Dictyostelium
discoideum]

(U49836) gamma-glutamyl
transpeptidase precursor [Brugia
malayi]



0.58 



Mouse hair keratin
A1 (MHKA1) gene,

complete cds | 0.58

T.thermophila gene 

forsnRNA V3-2 | 0 M 



2347188 



(AC002338) laccase isolog
[Arabidopsis thaliana]



141165 
2826900 



0.58 



1934730 



0.58 



2194131 



HYPOTHETICAL 8.3 KD 

PROTEIN >gi|62l79 

(AB004461) DNA polymerase 
alpha catalytic subunit [Oryza 
sativa) 



(U95036) germin-like protein
[Arabidopsis thaliana]



(AC002062) Similar to 
Synechocystis antiviral protein 



0.58



3914197 



OCCLUDIN >gi|1276983
(U49221) occludin [Canis
familiaris]

>gi|1589181|prf||2210347D
occludin [Canis familiaris]



5.3 



4.1 



3.2 



3.2 



3.1 



3.1 



3.1



3.1






SEQ
ID

Nearest Neighbor (BlastN vs. Genbank)
ACCESSION | DESCRIPTION | P VALUE

Nearest Neighbor (BlastX vs. Non-Redundant Proteins)

ACCESSION | DESCRIPTION | P VALUE


' 297 


| |A.thaliana (L.Hcynh.) 
1 chJoroplast mRNA 

jfor recombinant APS- 
X^S782 kinase J 

(Mouse plaieler- 


0.58 


1732444 


(D38529) DRPLA protein 
[Homo sapiens] 


_ 2.4 I 



B chain muscuJus 
platelet-derived 
growth factor beta- 
chain (sis) gene, exon I 

298 J M64848 |5. | rj.58 

Helicobacter pylori,
strain J99 section 21
of 132 of the
AE001460 complete genome | 0.58



299 



300 



M.musculus gene for
protein kinase C-
gamma (exon 1 and
X65720 exon 2)



301 



0.58 



lArabidopsis thaliana 

AF043130 lactate dehydrogenase! 0,58 
I Human genes for 
jcollagen type IV 
alpha 5 and 6, exon I 

_D28H6 Jandexonr I 0.58 



3025832 



2827198 



418395 



(AF055985) pyrrolidone-rich
antigen [Onchocerca volvulus]



(AF037454) ubiquitin protein



ligase [Mus musculus]

"LkjiVkoibiN — L "

>gi|320737|pir||S30818
hypothetical protein YER164w -
yeast (Saccharomyces
cerevisiae) >gi|603404
(U18917) Chd1p: transcriptional
regulator [Saccharomyces
cerevisiae]



1.4 



1.1 



3024637 



SEX-DETERMINING
REGION Y PROTEIN
determining protein [Mus



1.1 



302 



303 



Archaeoglobus
fulgidus section 32 of
172 of the complete
AE001075 genome



1458250



(U64835) T09D3 J
[Caenorhabditis elegans]



0.36 



(Z97991) hypothetical protein

0.58 1 2276333 Rv0336 



0.36 



304 I AF003948 



Rhodococcus opacus 
chloromuconate 
cycloisomerase
transposase homolog
genes, complete cds



305 I U10692 



Human MAGE-7
antigen (MAGE7)
pseudogene, complete
cds



0.58


477072 i 


mucin 7 precursor, salivary - 
human 


0.28 


0.58 1 


3287858 | 


HOMEOBOX PROTEIN HOX-
C11


0.054 






SEQ
ID | ACCESSION



Nearest Neighbor (BlastN vs. Genbank)



HSF2=Heat shock
factor 2 {alternatively
spliced, splice
junction region}
[mice, CBA/J, testis,
Genomic, 120 nt,
segment 2 of 3]



Rat liver mRNA for
Kan-1, complete cds



Homo sapiens mRNA
for KIAA0449
protein, partial cds




DESCRIPTION 1 P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION | P VALUE



0.58 



4038594 



(AJ222798) tDETl protein 
tfLycopersicon esculcntmnl 
E3375J coded tor by C 



3e-06 



0.58 



1280135 



lelegans cDNA crn21e6; coded 
IforbyC. elegans cDNA 
|cm01e2; similar to melibiose 
learner protein 

Kthiomethylgalactoside perme 
III) 

EPIDERMAL GROWTH



1e-08



Homo sapiens mRNA
for Efs1, complete
cds



0.58



2833239



FACTOR RECEPTOR
KINASE SUBSTRATE EPS8
>gi|530823 (U12535) epidermal
growth factor receptor kinase
substrate [Homo sapiens]



3e-13



Saccharomyces 
cerevisiae IRE1 gene
for putative protein
kinase.



0.58



2943716



(D45027) 25 kDa trypsin
inhibitor [Homo sapiens]



2e-14



S.cerevisiae
chromosome X
reading frame ORF
YJR035w



0.58



3880115 



(Z81130) T23G11.9 
[Caenorhabditis elegans]



S.cerevisiae DBF20 
gene, complete cds. 



0.58 



4106562 



Yeast PSS gene for 
phosphatidylserine 
synthetase 



0.57 



<NONE> 



Snail gene for ADP- 
ribosyl cyclase, 
complete cds 



0.57



<NONE> 



S.cerevisiae 
chromosome XV 
reading frame ORF 
YOR096w 



0.57 



<NONE> 



Homo sapiens
(subclone 10_e10
from P1 H16) DNA
sequence.



0.57 



<NONE> 



0.57 



<NONE> | 



(Z83819) dJ146H21.2 (similar
to CYTOCHROME B-245 
HEAVY CHAIN) [Homo 
sapiens] 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



73 



9e-21 



3e-33 



<NONE> 



<NONE> I 



<NONE> 



<NONE> 



<NONE> 






Nearest Neighbor (BlastN vs. Genbank)
SEQ

ID | ACCESSION | DESCRIPTION



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE | ACCESSION



DESCRIPTION 



327 1 D37887 



myc gene for c-Myc,
complete cds



328 | AB014562



0.57



<NONE>



<NONE>



P VALUE 



329 I 269651 



330 | P89285 



_331 I Z48951 



332 1 X95573 



Homo sapiens mRNA
for KIAA0662
protein, partial cds
Human DNA



0.57 



(M57576) Ig kappa chain [Mus
197406 musculus]



sequence from 
cosmid L75B9, 
Huntington's Disease 
Region, chromosome | 
4pl6.3 



0.57 



chaperonin containing TCP-1
complex gamma chain - African
clawed frog >gi|793886
1079280 (X84990) Cctg



Mesocricetus auratus
mRNA for inter-alpha-
trypsin inhibitor
heavy chain 1,
complete cds



<NONE> 



037 



S.cerevisiae
chromosome XVI 
cosmid 9723 



134132 



ryanodine receptor,
skeletal muscle



0.57



(AJ130783) APC2 protein [Mus
4210432 musculus]

TYROSINE



A.thaliana mRNA for
salt-tolerance zinc
finger protein



0.57



1174828



DECARBOXYLASE 2
4.1.1.25) - parsley >gi|169671
(M96070) tyrosine
decarboxylase [Petroselinum



8.9 



8.9 



6.9



5.3 



333 I U95094 



Xenopus laevis XL- 
INCENP (XL- 
INCENP) mRNA, 
complete cds 



0.57 



465646 



PROBABLE ABC
TRANSPORTER ATP-
BINDING PROTEIN IN
NTRA/RPON 5'REGION
(ORF1) Azorhizobium
caulinodans >gi|311388
(X69959) ORF1



334 | AE001116



Borrelia burgdorferi 
(section 2 of 70) of 
the complete genome 1 



0.57 



2314735 



335 | Z34291



R.norvegicus mRNA
for putative chloride 
channel. 



0.57 



1350832 



(AE000653) Na+/H+ antiporter
(nhaA) [Helicobacter pylori
26695]



DNA-DIRECTED RNA

POLYMERASE I SECOND
LARGEST SUBUNIT (RNA
POLYMERASE I SUBUNIT 2)
chain RPA2 - Euplotes
octocarinatus (SGC9)
>gi|578407 octocarinatus]



4.0 



4.0 






SEQ
ID

Nearest Neighbor (BlastN vs. Genbank)

ACCESSION | DESCRIPTION | P VALUE

Nearest Neighbor (BlastX vs. Non-Redundant Proteins)

ACCESSION | DESCRIPTION | P VALUE


336 


D88255 


Homo sapiens A30
Vk germline gene,
partial cds


0.57


3875983


U.01U0J) similar to Actinin-tyr 
jactin-binding domain containin 

InrrkfAinc \ (~* -l-j.. u- .. 1. ..L.i*.i: 

proteins i^aenornarxuus 
elecans] 


)C\ | 

g 1 

1 3.0 1 


337 


AF037261


Homo sapiens SH3-
containing adaptor
molecule-1 mRNA,
complete cds


0.57


1397341 


(uciyjjj aim jar to mernnnt 

protein; coded for by C. elegani 
cDNA ykl84h5.3; coded for by 
C. elegans cDNA yki84h5.5; 

COded for bv C rlf»iranc rHMA 

ykl3d7.3; coded for by C. 
elegans cDNA yk!3d7.5; coded 
for by C. elegans cDNA 

virile 1 *\ • s»#% v »:h^nic a i 

yKoiCLD, co.,. >gi|349354I 
(AF057567) Idnesin-like protein 
ZEN-4a [Caenorhabditis 
elegans) 


e"[ — 1 

g 1 

[ 2.3 1 


338 


U26595 J 


Rattus norvegicus 
prostaglandin F2a 
receptor regulatory 
protein precursor, 
mRNA, complete cds 


0.57 


2773160 


V/vrujyqjo) neuronal tissue- 
enriched acidic protein [Homo 
sapiens]


2.3


339


R.norvegicus mRNA
for interleukin 4
X69903 receptor


0.57


2649193 


(AE001009) quinone-reactive
Ni/Fe-hydrogenase B-type
cytochrome subunit (hydC)
[Archaeoglobus fulgidus]


1.8


340 


Z74825


S.cerevisiae
chromosome XV
reading frame ORF
YOL083w


0.57


1458319


(U64S46) F47D2.5 gene
product [Caenorhabditis
elegans]


1.4


341 


AJ131469


Foot-and-mouth
disease virus O vp1
gene, strain O/A/58


0.57


91206


proline-rich protein - mouse
(fragment) musculus]


1.4


342 


AF011360


Mus musculus
regulator of G-protein
signaling 7 (RGS7)
mRNA, complete cds


0.57


542514


gelsolin - American lobster


0.80


343


AF011360


Mus musculus
regulator of G-protein
signaling 7 (RGS7)
mRNA, complete cds


0.57


1078946


gelsolin - American lobster
>gi|452313 gelsolin [Homarus
americanus]


0.80









SEQ 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION | DESCRIPTION | P VALUE



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



DESCRIPTION 



345 I U81S23 



Homo sapiens inosine
monophosphate
dehydrogenase type II
gene, complete cds
Human endometrial
bleeding associated
factor mRNA,
complete cds



0.57



0.57 



559526 



211499 



(X77466) 98.8kD polyprotein
[Strawberry latent ringspot
virus]



0.79



Tetrahymena
thermophila
polyubiquitin (TTU3)
gene, complete cds,
and RNA polymerase
II subunit 2 (RPB2)
gene, partial cds



(K01702) HMW/LMW collagen
subunit precursor [Gallus gallus] | 0.79



0.57 



2506493 



HYPOTHETICAL 100.5 KD
PROTEIN IN IAP-CYSH
INTERGENIC REGION
>gi|8S2654 (U29579) alternate
gene name ygcB; ORF_f888
[Escherichia coli] >gifl7ftQl to



0.60 



347 



C.japonica mRNA for
legumin (clone
X95543 CjLeg31)



348



Homo sapiens mRNA
Y17282 for cytokeratin type II
Frog mRNA fragment



349



for alpha-A2-
X00716 crystallin

Klebsiella sp.



350 | X53238



bacteriophage K11
gene 1 for RNA
polymerase



0.57 



1709261 



M PROTEIN (1 60 KD 
NEUROFILAMENT 
PROTEIN) (NF-M) 
>SiH083I64|pir||S55395 
[neurofilament protein M 
(fragment) >gi|854353 



rabbit | 



0.57 



3044086 



(AF055904) unknown 
[Myxococcus xanthus]



0.46 



0.57 



3406654 



(AF079369) transcriptional 
repressor TUP1 [Dictyostelium 
discoideum]



0.45 



0.20 



351 I X99012 



H.sapiens FUS gene,
exon 12 



Human DNA 
sequence from PAC 
390N22 on 
_35j_ ALQQ87U chromosome Xp22. 2 



S74506 



SOX9 [human, fetal 
brain. Genomic, 1494 
nt. segment 3 of 3 



0.57 



1228093 g46913> P o»vtoid C5yi , l ha Se Q.16 



0i7 



t 243898 



(S78897) GOR=antigenic
epitope [chimpanzees, Peptide,
427 aa] [Pan]



0.090 



0.57 



1469545 



0.57 



1326350 



(U53585) fibronectin attachment
protein [Mycobacterium avium]
(U58748) similar to potential
transmembrane domains in S.
cerevisiae nuclear division
RFT1 protein (SP:P38206)



0.053 



0.017 






SEQ 



Nearest Neighbor (BlastN vs. Genbank)



ID | ACCESSION | DESCRIPTION | P VALUE



354 



Human mRNA for
golgi antigen gcp372,
D25342 | complete cds | 0.57




Mus musculus mRNA
for alpha1,3

fucosyltransferase IX,
355 | AB015426 | complete cds

Xenopus mRNA for
APEG protein,
containing a highly
repetitive amino acid
X51394 | sequence



0.57 



0.57 



357 | AB007918



Homo sapiens mRNA
for KIAA0449
protein, partial cds



358 | AB001466



Homo sapiens mRNA
for Efs1, complete
cds



0.57



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



(Ariu^/5)ceil surface pToTeln" 



DTFA [Dictyostelium
4063399 | discoideum]



2661842 



1929056 



(Y15732) DNA polymerase beta 



7e~H 



(Y12090) putative 3,4-
dihydroxy-2-butanone kinase
[Lycopersicon esculentum] | o^.



J(Lycope 



2833239 



EPIDERMAL GROWTH
FACTOR RECEPTOR
KINASE SUBSTRATE EPS8
>gi|530823 (U12535) epidermal
growth factor receptor kinase
substrate [Homo sapiens] | 3 e .n



359 | Y00760



Rabbit mRNA for
adult fast skeletal
troponin-C



0.57



2943716 



(D45027) 25 kDa trypsin
inhibitor [Homo sapiens]



2e-14



360 | X95153
361 | X85967



H.sapiens brca2 gene
exon 3 > ::

emb|A62778|A62778
Sequence 19 from
Patent WO9719110



0.57 



2576348 



(AC002400) Glutamyl tRNA
synthetase [Homo sapiens]



2e-28 



B.vulgaris mRNA for
betavulgin
Mycoplasma



0.57 



3419847 



(AC004982) similar to yeast
hypothetical protein ybk4;
similar to P38164
(Pg>:gS8646h [Homo sapiens]



0.56 



<NONE> 



<NONE> 



_363 I VQQI58 



genitalium DNA
gyrase subunit B
complete cds, DNA
polymerase III beta
subunit (dnaN) and
seryl-tRNA
synthetase (serS)
genes, partial cds.

Chloroplast Euglena
gracilis genes coding
for transfer RNAs

specific for threonine,
glycine, methionine,
serine and glutamine.



<NONE>



0.56 



<NONE> 



<NONE> 



<NONE> 



0.56 



<NONE> 






<NONE> 



<NONE>






Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE
364 | D88151 | Clostridium perfringens DNA for D-alanine:D-alanine ligase, cortical fragment-lytic enzyme | 0.56 | <NONE> | <NONE> | <NONE>
365 | U67478 | Methanococcus jannaschii section 20 of 150 of the complete genome | 0.56 | <NONE> | <NONE> | <NONE>
366 | L23800 | Tachyglossus aculeatus beta-globin homolog (HBB) gene, complete cds | 0.56 | <NONE> | <NONE> | <NONE>
367 | AB011129 | Homo sapiens mRNA for KIAA0557 protein, partial cds | 0.56 | <NONE> | <NONE> | <NONE>
368 | L77034 | (subclone 10_e10 from P1 H16) DNA sequence. | 0.56 | <NONE> | <NONE> | <NONE>
369 | Z47202 | C.albicans gene for TFIIIB (BRF1) subunit. | 0.56 | <NONE> | <NONE> | <NONE>
370 | U53868 | Clostridium acetobutylicum mannitol-specific phosphotransferase system (PTS) system, mtlA, mtlR, mtlF, and mtlD genes, complete cds | 0.56 | <NONE> | <NONE> | <NONE>
371 | AF041259 | Homo sapiens breast cancer putative transcription factor (ZABC1) mRNA, complete cds | 0.56 | <NONE> | <NONE> | <NONE>
372 | L42636 | Plasmodium falciparum variant-specific surface protein (var-7) mRNA, complete cds. | 0.56 | 2213557 | (Z97052) hypothetical protein | 8.8






Nearest Neighbor (BlastN vs. Genbank)
SEQ

ID | ACCESSION



373 | U96180



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



DESCRIPTION | P VALUE



DESCRIPTION



374 | L76259



375 | AF045946



Human protein
tyrosine phosphatase
(TEP1) mRNA,
complete cds



Homo sapiens PTS
gene, complete cds



Mus musculus
D16Jhu17 YAC
98B3 acentric end,
partial sequence



376 | X97986



M.musculus mRNA
for desmocollin type



377 



M.musculus whey
acidic protein (WAP)
X79437 | gene, exon 1



378 



379 



Caenorhabditis
elegans cosmid
AF036696 | P15BI0



380 | Z99102



Caenorhabditis
elegans cosmid
B033 Incomplete
sequence
[Caenorhabditis
elegans]



0.56



731016 



0.56 



2369863 



P VALUE



THIOREDOXIN REDUCTASE
thioredoxin reductase (NADPH)
[Coxiella burnetii]



(Y12225) Spi-1/PU.1
transcription factor 



2130017 



4038031 



hypothetical protein - common
sunflower protein [Helianthus
annuus]



(AC005936) hypothetical
protein [Arabidopsis t



0.56 



COMPONENT SPC42 yeast
(Saccharomyces cerevisiae)
>gi|486054 (Z28042) ORF
YKL042w [Saccharomyces
cerevisiae] >gi|666098
(X71621) hypothetical 42.3 kD
protein [Saccharomyces
cerevisiae]



Rat cardiac specific
sodium channel alpha-
subunit mRNA,
M27902 | complete cds. | 0.56



585234 



0.56 



ENDOGLUCANASE G
PRECURSOR 3.2.1.-) CelCCG
precursor - Clostridium
cellulolyticum



0.56



381 | L27850



Equus caballus (clone
JT131) T-cell receptor
(PNA, V-region.



0.56 



603664 



gp70=envelope protein
{endogenous provirus} host=cat
lymphoid tissues, Peptide, 445



H41U1J putative reverse 



transcriptase; ORF2; encodes aa
motifs conserved in reverse
transcriptases; most closely
related reverse transcriptases are
those of non-LTR
retrotransposons. The 3' 901 bp
of this CDS are identical to the
3' 901 bp ...



1079150 



8J 



67 



5.1 



3.9 



3.9 



3,6 



3.0 






Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE
382 | X97986 | M.musculus mRNA for desmocollin type 1 | 0.56 | 2497227 | PROTEIN IN PRE5-FET4 INTERGENIC REGION >gi|1072409 (Z54141) unknown | 1.7
383 | AF087455 | Didelphis virginiana G protein receptor kinase 2 mRNA, complete cds | 0.56 | 1213453 | (U12964) contains ankyrin-like repeats; similar to human desmoplakin repeat region [Caenorhabditis elegans] | 1.3
384 | D80011 | Human mRNA for KIAA0189 gene, complete cds | 0.56 | 226535 | protease [Hepatitis B virus] | 1.1
385 | AJ002272 | Mus musculus mRNA for HAP1-A protein, 3' region | 0.56 | 3327158 | (AB014572) KIAA0672 protein [Homo sapiens] | 1.0
386 | L39210 | Homo sapiens inosine monophosphate dehydrogenase type II gene, complete cds | 0.56 | 628431 | coat protein - strawberry latent ringspot virus | 0.77
387 | X02770 | Mouse Thy-1.2 gene 5' untranslated region and exon 1 | 0.56 |  | (AB014516) KIAA0616 protein [Homo sapiens] | 0.59
388 |  | Schizosaccharomyces pombe Wiskott-Aldrich Syndrome protein homolog (wsp1+) gene, complete cds, and BTF3/beta-NAC gene, partial sequence | 0.56 | 88466 | salivary proline-rich phosphoprotein precursor PRH1 (allele PEF) - human >gi|190484 (K03203) prepro salivary proline-rich protein [Homo sapiens] >gi|190512 | 0.35
389 | X56747 | Rat mRNA for fetal intestinal lactase-phlorizin hydrolase precursor, partial | 0.56 | 2072742 | (Z48674) chitinase homologue [Sesbania rostrata] | 0.23
390 | Y12072 | G.arboreum mRNA for farnesyl pyrophosphate synthase | 0.56 | 296670 | (X07882) Po protein [Homo sapiens] | 0.20
391 | S75756 | p15=cyclin D-dependent kinases 4 and 6-binding protein/p15 product {exon/intron 1} [human, brain tumors, Genomic, 753 nt] | 0.56 | 1082743 | protein kinase (EC 2.7.1.37) PRK - human >gi|1090771|prf||2019437A protein Tyr kinase [Homo sapiens] | 0.15






Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE
392 | U62528 | Equus caballus type II collagen mRNA, complete cds | 0.56 | 461671 | [Segment 1 of 2] COLLAGEN ALPHA 1(0 CHAIN | 0.030
393 | X96877 | C.reinhardtii mRNA for unknown lumenal polypeptide | 0.56 | 3341678 | (AC003672) putative zinc finger protein [Arabidopsis thaliana] | 5e-09
394 | S78788 | cGATA-3 [chickens, liver, Genomic, 979 nt, segment 4 of 4] | 0.56 | 2661590 | (AL009196) 1-evidence=predicted by content; 1-method=genefinder;084; 1-method_score=59.41; 1-evidence_end; 2-evidence=predicted by match; 2-match_accession=AA950019; 2-match_description=LD29959.5prime LD Drosophila melanogas... | 2e-11
395 | AF006640 | Drosophila melanogaster Ste20-like protein kinase mRNA, complete cds | 0.56 | 1109830 | (U41534) coded for by C. elegans cDNA CEESI42F; Similar to helicases of SNF2/RAD54 family. [Caenorhabditis elegans] | 6e-12
396 | AF006640 | Drosophila melanogaster Ste20-like protein kinase mRNA, complete cds | 0.56 | 1109830 | (U41534) coded for by C. elegans cDNA CEESI42F; Similar to helicases of SNF2/RAD54 family. [Caenorhabditis elegans] | 4e-13
397 | AE000716 | Aquifex aeolicus section 48 of 109 of the complete genome | 0.56 | 3688350 | d) 1189624 d (novel PUTATIVE protein similar to hypothetical proteins S. pombe \-,jLj.rj. and C. elegans C16A3.8) [Homo sapiens] | 3e-66
398 | Z36079 | S.cerevisiae chromosome II reading frame ORF YBR210w | 0.55 | <NONE> | <NONE> | <NONE>
399 | Y17267 | Mus musculus mRNA for ubiquitin conjugating enzyme | 0.55 | <NONE> | <NONE> | <NONE>
400 | AC001461 | Homo sapiens (subclone 2_j5 from PAC H107) DNA sequence | 0.55 | <NONE> | <NONE> | <NONE>






SEQ
ID | ACCESSION



Nearest Neighbor (BlastN vs. Genbank) 



401 | AF019079



402 | M90058



DESCRIPTION 



Alouatta seniculus



breast and ovarian 
susceptibility 
(BRCA1) gene, 
partial cds 



Human serglycin 
gene, exons 1,2, and 



403 | AB013469



404 | AJ011592



405 | Z15118
406 | Z48951



407 | U78726



408 | AG001389



409 | M27640



Mus musculus CLM2 
gene for cytohesin 2, 
complete and partial 
cds, alternative 
splicing 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION



0.55 



<NONE> 



<NONE> 



<NONE>



0.55 



<NONE> 



<NONE> 



<NONE>



Bacteriophage P1 ban
gene



T.brucei kinetoplast
maxicircle variable
region DNA
S.cerevisiae
chromosome XVI
cosmid 9723



0.55 



1729760 



(Z68152) chitinase [Gossypium
hirsutum] | g.6



0.55 



2493689 



PHOTOSYSTEM II 10 KD
PHOSPHOPROTEIN
>gi|2143326|gnl|PID|e319090
(Y1332S) 10kDa
phosphoprotein [Populus
deltoides]



0.55 



Homo sapiens mad
protein homolog
Smad2 gene,
promoter, exon 1a
and exon 1b



Homo sapiens
genomic DNA, 21q
region, clone:
9H11Bm42



Plasmodium vivax 
major blood stage 
surface antigen gene, 
partial cds. 



0.55



2970432
4210432 



(AF049132) NADH
dehydrogenase subunit 5
[Florometra serratissima]

(AJ130783) APC2 protein [Mus
musculus]



6.6 



6.5 



4.9 



0.55 



3319290 



(AF055994) thyroid hormone
receptor-associated protein
complex component TRAP220

[Homo sapiens]



0.55 



125684 



KRUEPPEL PROTEIN
>gi|72899|pir||TWFF Krueppel
gap protein - fruit fly
(Drosophila sp.) melanogaster]
>gi|224875|prf||1202348A
Krueppel gene

X-LINKED PEST-



3.8 



0.55 



549453 



CONTAINING
TRANSPORTER transporter -
human >gi|458255 (U05321) X-
linked PEST-containing
transporter [Homo sapiens]



3.8 






Nearest Neighbor (BlastN vs. Genbank)



421 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



Aquilegia sp.
phytochrome
(PHYB/D) gene,

422 | U08147 | partial cds.



0.55 



H.sapiens CpG DNA,
clone 12c8, reverse

423 | Z56586 | read cpg12c8.rt1a



0.55



Mus musculus
glutamine:fructose-6-
phosphate
amidotransferase
(GFAT) gene, 5'

424 | U39442 | region and partial cds | 0.55
Rat chymotrypsin B

(chyB) gene,

0.55



gene,
425 | K02298 | complete cds



M.musculus clusterin

426 | X84792
Capra



Capra aegagrus
Saanen and Weisse
Edel breeds DR beta-
chain antigen binding
domain, MHC class

427 | U00185 her

H.sapiens CpG DNA,
clone 178a12, reverse
read cpg178a12.rt1a.

Oryctolagus
cuniculus anion
exchanger 3 brain
isoform (AE3)
mRNA, complete cds



0.55 



428 | Z54946



429 | AF031650



0.55 



0.55 



(AF005370) ribonucleotide
-i 33802 * | reductase large subunit



^46007)65^ [Rattus
3320122 | norvegicus] | 0.44



430 | M25579



431 | Z48796



Bovine adenylyl
cyclase Type I
mRNA, complete cds.
H.sapiens Ski-W
mRNA for helicase | 0.55



0.55



hypothetical protein -
JgggjO | Mycoplasma hyorhinis



(Y17034) Bassoon [Mus
3413810 | musculus]



2507136 | PROTEIN SPAB



SUBTILIN BIOSYNTHESIS 



807646 



(M17294) unknown protein
[Human herpesvirus 4]



1778210 



(U68412) fibrillar collagen
[Arenicola marina]



(AE000997) conserved 
hypothetical protein 
[Archaeoglobus fulgidus]



(M14708) DNA polymerase
[Human cytomegalovirus] 



0.43 



0.33



-^22 (D90905) hypothetical protein



0.19 



0.065 



0.044



0.023 



0.023 






SEQ 



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION | DESCRIPTION | P VALUE



Cow dopamine
transporter mRNA,
M80234 | putative cds.



433 | U91616



434 | D10910



435 | L22013



436 | Z92653



Swinepox virus
complete ORFs
C20L-C1L > ::
gb|I58297|I58297
Sequence 14 from
patent US 5651972



0.55 



3876072 



Human 



0.54 



DESCRIPTION 



P VALUE



(Z99709) similar to Elongation
factor Tu family (contains
ATP/GTP binding P-loop);
cDNA EST EMBL:D76223
comes from this gene; cDNA
EST yk478c5.5 comes from this
gene [Caenorhabditis elegans]



(Z68314) similar to G-protein;
cDNA EST EMBL:C11959
comes from this gene; cDNA
EST EMBL:C10341 comes
from this gene; cDNA EST
yk494e4.3 comes from this
gene; cDNA EST yk448a8.5
comes from this gene comes
from this gene; cDNA EST

EMBL:C10341 comes from this
gene; cDNA EST yk494e4.3
comes from this gene; cDNA
EST yk448a8.5 comes from this
gene [Caenorhabditis elegans]
>gi|3880364|gnl|PID|e1349948
(Z83016) similar to G-protein;
cDNA EST EMBL:C11959
comes from this gene; cDNA
EST EMBL:C10341 comes
from this gene; cDNA EST
yk494e4.3 comes from this

gene; cDNA EST yk448a8.5
comes from this gene

[Caenorhabditis elegans]



4e-04 



(Z81505) Similarity to
Methanococcus hypothetical
protein 0682 (TR:Q58095)
[Caenorhabditis elegans]



0.54 



<NONE> 



<NONE> 



7e-06 



4e-42



<NONE> 








Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE
437 | K01992 | E.coli phosphate-repressive periplasmic phosphate-binding protein (phoS), peripheral membrane proteins (pstC pstB and phoU) and integral membrane protein (pstA) genes, complete cds. | 0.54 | <NONE> | <NONE> | <NONE>
438 | AE001415 | Plasmodium falciparum chromosome 2, section 52 of 73 of the complete sequence | 0.54 | <NONE> | <NONE> | <NONE>
439 | AF064010 | Helianthus tuberosus lectin 2 mRNA, complete cds | 0.54 | <NONE> | <NONE> | <NONE>
440 | X12591 | E.coli plasmid DNA for colicin E9 | 0.54 | <NONE> | <NONE> | <NONE>
441 | U73679 | Caenorhabditis elegans YNK1-a mRNA, complete cds | 0.54 | <NONE> | <NONE> | <NONE>
442 | Z93990 | Unidentified bacterium DNA for 16S ribosomal RNA | 0.54 | <NONE> | <NONE> | <NONE>
443 | X85967 | B.vulgaris mRNA for betavulgin | 0.54 | 757836 | (Z37980) ORF12 [Escherichia coli] | 8.3
444 | U76524 | Sambucus nigra ribosome inactivating protein precursor mRNA, complete cds | 0.54 | 151377 | (M80653) tetraheme [Pseudomonas stutzeri] | 6.2
445 | X71800 | H.sapiens gene for 5S rRNA (640 bp) > :: emb|X71801|HS5SR640B H.sapiens gene for 5S rRNA (640 bp) | 0.54 | 3322653 | (AE001216) T. pallidum predicted coding region TP0369 | 2.7
446 | U89241 | Human mibp gene, partial cds | 0.54 | 4097465 | (U62253) 16kDa secretory protein [Sus scrofa] | 2.2






Nearest Neighbor (BlastN vs. Genbank)



SEQ

ID | ACCESSION



447 | L16013



448 | U60275



449 | U36795



DESCRIPTION | P VALUE



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION



Rattus norvegicus Q- 
like gene sequence 



0.54 



451 | V00602



452 | U60800



453 | X85969



454 | Y08265



Capra hircus skeletal
muscle voltage-gated
chloride channel
gClC-1 mRNA,

partial cds | 0.54

Myxococcus xanthus
rfbABC O-antigen
biosynthesis operon,
rfbA, rfbB, and rfbC

genes, complete cds, | 0.54
Drosophila

melanogaster eyelid
(eld) mRNA,

complete cds | 0.54



1781344 



3877232 



Genome of the 
bacteriophage fd 
(Inoviridae). 



2144110 



0.54 



Human semaphorin 
(CD100) mRNA, 
complete cds 



0.54



coelicolor secD. 
secF & apt genes 



0.54



3874972 



(Y10438) FK506 polyketide
synthase



(Z81540) predicted using
Genefinder



zinc finger protein RIZ - rat
>gi|949996



(AL009197) hypothetical
2661620 | protein

KERATIN, ULTRA HIGH-
SULFUR MATRIX PROTEIN
(UHS KERATIN)
>gi|109n6|pir||A36686 ultra
high-sulfur keratin - sheep
>gi|1306 (X55294) ultra high-
sulphur keratin protein [Ovis
125682 | aries]



(Z99709) similar to Elongation
factor Tu family (contains
ATP/GTP binding P-loop);
cDNA EST EMBL:D76223
comes from this gene; cDNA
EST yk478c5.5 comes from this
gene [Caenorhabditis elegans]



H.sapiens mRNA for
DAN26 protein, 
partial 



0.54 



3875131 



(Z70750) similar to vanadate
resistance protein
transmembranous domains
[Caenorhabditis elegans]



P VALUE



(AJ005583) p75 protein
3087760 | [Crypthecodinium cohnii] | 0.95



0.95 



0.74



0.14 



0.11 



0.003 



7e-06 



5e-12 
Nearest Neighbor (BlastN vs. Genbank)



SEQ

ID | ACCESSION | DESCRIPTION



Hydromantes
platycephalus



455



456 | AF034597



cytochrome b (cytb)
gene, mitochondrial
gene encoding
mitochondrial
U89613 | protein, partial cds

Habrobracon hebetor
cytochrome oxidase
II gene, partial cds;
and tRNA-Asp, tRNA-
His, and tRNA-Lys
genes, complete
sequence,

mitochondrial genes
for mitochondrial
products



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION | P VALUE



0.53



<NONE> 



<NONE> 



457 



Yeast (S.cerevisiae)
tau repetitive element
K02653 | and Cys-tRNA,



0.53 



<NONE> 



<NONE>



458 



Human mRNA for
actin-binding protein
X53416 | (filamin)



459 | M55545



Drosophila
subobscura alchohol
dehydrogenase (Adh)
gene, and alchohol
dehydrogenase (Adh-
dup) gene, complete
cds's.



0.53 



<NONE> 



<NONE> 



0.53 



bullous pemphigoid antigen 2 
2134839 human 



0.53 



hair keratin cysteine rich protein
2136865 | - sheep



<NONE>



<NONE>



6.2 



2.1 




Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE
460 | U19362 | Methanobacterium thermoautotrophicum methylene-tetrahydromethanopterin dehydrogenase (mtd), imidazoleglycerol-phosphate dehydrogenase (hisB), and putative ferredoxin (fdxA) genes, complete cds, Orl9 gene, partial cds, orfs ... | 0.53 | 731969 | HYPOTHETICAL 91.6 KD PROTEIN IN HXT8-CRTI INTERGENIC REGION >gi|1078261|pir||S50773 probable membrane protein YJL212c - yeast (Saccharomyces cerevisiae) >gi|496950 (Z34098) ORF [Saccharomyces cerevisiae] >gi|1015596 (Z49437) ORF YJL212c | 0.54
461 | AB011527 | Rattus norvegicus mRNA for MEGF1, complete cds | 0.53 | 417037 | GERM CELL-LESS PROTEIN fruit fly (Drosophila melanogaster) >gi|157490 (M97933) germ cell-less protein | 3e-06
462 | U64313 | Bacillus firmus MsyB gene, 5' upstream region and partial cds | 0.52 | <NONE> | <NONE> | <NONE>
463 | AF008590 | Caenorhabditis elegans paraquat responsive protein (CePqM132) mRNA, complete cds | 0.52 | <NONE> | <NONE> | <NONE>
464 | L10245 | Mus saxicola spermidine/spermine N1-acetyltransferase (SSAT) gene, complete cds. | 0.52 | <NONE> | <NONE> | <NONE>
465 | AF027173 | Arabidopsis thaliana cellulose synthase catalytic subunit (Ath-A) mRNA, complete cds | 0.52 | 124263 | INSULIN-LIKE GROWTH FACTOR IB PRECURSOR (IGF-IB) (SOMATOMEDIN C) >gi|69361|pir||IGHU1B insulin-like growth factor IB precursor - human prepropeptide [Homo sapiens] | 7.7
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE
466 | AL021066 | Caenorhabditis elegans cosmid H31B20, complete sequence [Caenorhabditis elegans] | 0.52 | 2589162 | (D88451) aldehyde oxidase [Zea mays] | 6.0
467 | AF038588 | Porphyra linearis 18S ribosomal RNA gene, 3' partial sequence | 0.52 | 1055055 | (U39HDU) coded for by C. elegans cDNA yk37g1.5; coded for by C. elegans cDNA yk5c9.5; coded for by C. elegans cDNA yk1a9.5; alternatively spliced form of F52C9.8b | 4.6
468 | AE001125 | Borrelia burgdorferi (section 11 of 70) of the complete genome | 0.52 | 4115827 | (AB021287) polyprotein [Hepatitis G virus] | 2.0
469 | AF006640 | Drosophila melanogaster Ste20-like protein kinase mRNA, complete cds | 0.52 | 1109830 | (U41534) coded for by C. elegans cDNA CEESI42F; Similar to helicases of SNF2/RAD54 family. [Caenorhabditis elegans] | 0.002
470 | U90177 | Aplysia californica ubiquitin carboxyl-terminal hydrolase (Ap-uch) mRNA, complete cds | 0.51 | <NONE> | <NONE> | <NONE>
471 | Z28304 | S.cerevisiae chromosome XI reading frame ORF YKR079C | 0.51 | <NONE> | <NONE> | <NONE>
472 | Z92837 | Caenorhabditis elegans cosmid KUJti, complete sequence [Caenorhabditis elegans] | 0.51 | 123506 | HYDROPHOBIC SEED PROTEIN (HPS) | 7.6
473 | D13803 | Mouse mRNA for RecA-like protein MmRad51, complete cds | 0.51 | 3327228 | (AB014607) KIAA0707 protein [Homo sapiens] | 4.5
474 | X07187 | Pea hsp21 mRNA | 0.51 | 3328678 | (AE001299) hypothetical protein [Chlamydia trachomatis] | 4.4
SEQ



Nearest Neighbor (BlastN vs. Genbank)



ID | ACCESSION | DESCRIPTION



GaasEssiaaii 



481 | AJ001388



482 | M86626



483 | U76523



484 | AF031663



485 | U32729



486 | AF067198



cDNA for complete 
mRNA 



P.occultum 23S
ribosomal RNA, 
partial cds, 



P VALUE 



0.50 



0.50



Sambucus nigra lectin
precursor mRNA, 
complete cds 



0.50 



Mus musculus striatin
mRNA, complete cds | 0.50



Haemophilus
influenzae Rd section
44 of 163 of the



complete genome
Dictyostelium



0.50 



487 | M23442



488 | U16367



489 | AF001000



490 | Z18920



491 | D86983



492 | AF064030



discoideum clone 

10Tdd-3 and RED 
repetitive elements. 
Partial sequence 



Human interleukin 4
(IL-4) gene, complete



0.50 



cds. 



0.49



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION



<NONE> 



<NONE> 



1722856 



179521 



<NONE>



P VALUE



<NONE> 
CHROMOSOME ASSEMBLY
PROTEIN XCAP-E African
clawed frog >gi|563814
(U13674) XCAP-E [Xenopus
laevis]



<NONE> 



(M63730) BPAG2 [Homo
sapiens]



3875699 



2494740 



Caenorhabditis 
elegans POU 
homeobox protein 
CEH-18(ceh-18) 
mRNA, complete cds 



Lycopersicon
esculentum
polygalacturonase 1



Yersinia
enterocolitica wbb
gene cluster
Human mRNA for
KIAA0230 gene,
partial cds



Helianthus tuberosus 
lectin 2 mRNA. 
complete cds 



0.47 



0.45 



0.41



0.35 



0.33 



<NONE> 



(Z92829)F10A3.15 
[Caenorhabditis elegans] 



HYPOTHETICAL 24.ijed

PROTEIN IN GBD 5' REGION
(ORF4) >gi|2120954|pir||I39562
ORF4 - Alcaligenes eutrophus
>gi|695274 (L36817) ORF4



<NONE> 



3.2 



3.2 



0.65



<NONE> 



3786409 



<NONE> 



<NONE> 



(AF098499) contains similarity
to Saccharomyces cerevisiae
MAF1 protein (GB:U19492)
[Caenorhabditis elegans] 



<NONE> 



<NONE> 



206712 



<NONE> 



(M64793) salivary proline-rich 
protein [Rattus norvegicus]



<NONE>



0.008 



<NONE> 



8.9 



<NONE> 



<NONE> 



4e-05



<NONE> 
Nearest Neighbor (BlastN vs. Genbank)

SEQ

ID | ACCESSION | DESCRIPTION | P VALUE

Vitreoscilla sp. outer



493 | AF067083



494 | Y15520



495 | U33475



496 | D88356



497 | U67603



498 | U82386



499 | Z49625



membrane protein 
homolog gene, 
complete cds; Trp 
repressor binding 
protein gene, partial 
cds; and unknown 
genes 



0.33 



Papio hamadryas
anubis gene encoding
fertilin alpha-II



0.29 



Alestes sp. 
ependymin mRNA, 
partial cds 



Mouse DNA for 8-
oxodGTPase,
complete cds



0.28 



Methanococcus
jannaschii section 145
of 150 of the 
complete genome 



0.22 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



0.22 



501 | M24543



Malurus cyaneus
microsatellite McyU2 | 0.22
S.cerevisiae
chromosome X
reading frame ORF
YJR125c | 0.21



Dictyostelium
discoideum AX2
protein tyrosine
kinase gene, complete
cds. 



0.21 



Human prostate- 
specific antigen (PA) 
gene, complete cds. 



0.21



ACCESSION 



DESCRIPTION 



401553 



HYPOTHETICAL 24.5 KD 
PROTEIN IN NADB-SRMB 
INTERGENIC REGION



2408049 | (Z99164) hypothetical



Kffi } 64 * hypothetical protein 



8.3 



RECEPTOR NUCLEAR
TRANSLOCATOR
HOMOLOG (DARNT)
(TANGO PROTEIN)
transcription factor [Drosophila
39 1 3078 | melanogaster]



3.1 



<NONE> 



<NONE> 



2209261 



(U51222) p40 [Streptomyces
halstedii]



992631 



<NONE> 



(U29131) Mg-chelatase subunit
[Synechocystis sp.



<NONE> 



1.4 



<NONE> 



8.3 



<NONE> 



<NONE> 



<NONE> 



2764859 



(X97918) gene 12.1 
[Bacteriophage SPP1]



<NONE> 



6.0 
SEQ

ID | ACCESSION



502 | X87618



503 | X71591



504 | X57808



505 | U95098



DESCRIPTION | P VALUE



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



B.taurus mRNA for
thrombospondin
(partial) 2162 bp | 0.21
B.taurus
microsatellite

sequence INRA048 | 0.21



Human germline
immunoglobulin 
lambda light chain 
gene 



Xenopus laevis 
mitotic 

phosphoprotein 44 
mRNA. partial cds 



0.21



DESCRIPTION 



P VALUE



2146000 



506 | U84216



507 | U31463



508 | X51508



509 



510 



AF086476 



AF077006 



Mycobacterium 
fortuitum plasmid 
pJAZ38 replication 
protein Rep (rep) 
gene, complete cds 



Rattus norvegicus 
nonmuscle myosin 
heavy chain-A 
mRNA, complete cds. 



0.21



Rabbit mRNA for 
amino peptidase N 
(partial)



Homo sapiens full 
length insert cDNA 
clone ZD88F12 
Helicobacter pylori 
plasmid pHPM186,
complete sequence 



0.21 



0.20 



Mycobacterium tuberculosis
tuberculosis]

>gi|1694863|gnl|PID|e283373
(Z83018) hypothetical protein
Rv2968c [Mycobacterium
tuberculosis]



3.5 



1354453 



2119158 



procollagen type V alpha 2 - 
Js*^£r£« 



2497139 



PROTEIN IN ABF2-CHL12
INTERGENIC REGION
>gi|1078003|pir||S52835
hypothetical protein YMR075w -

yeast (Saccharomyces

cerevisiae) >gi|763022

(Z43952) unknown

[Saccharomyces cerevisiae]



2.7 



2.0 



2499087 



UDP-

GLUCOSE:GLYCOPROTEIN
GLUCOSYLTRANSFERASE
PRECURSOR (DUGT)
glucosyltransferase - fruit fly
(Drosophila sp.)
glucosyltransferase precursor
[Drosophila melanogaster] | 0.003



(Z81130) predicted using
3880111 | Genefinder



630864 



LRR47 protein - fruit fly
(Drosophila melanogaster)
>gi|415947 (X75760) LRR47
[Drosophila melanogaster]



0.002 



<NONE> 



<NONE>



<NONE> 



<NONE>



<NONE> 



<NONE> 
SEQ
ID



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION 



512 



513 



514 



515 



X75036 



D90875 



516 



517 



518 



519 



Z68343



X62486 



DESCRIPTION | p VALUE 



aestivum



mitochondrial nad7
gene for NADH
dehydrogenase

subunit 7 | 0.20



E.coli genomic DNA,
Kohara clone

^22(55^55,8 mini | 0.20
Caenorhabditis
elegans cosmid
F59B8, complete
sequence
[Caenorhabditis

elegans] | 0.20



M.musculus V alpha 
MJ gene 5-re|zion 



0.20 



AJFQ4Q651 



JLJ10470 



D83778 



S43579 



520 



U07357 



Caenorhabditis
elegans cosmid
WQ4H10

Pseudomonas
fluorescens PHA
depolymerase (phaZ)
gene, complete cds.
Human mRNA for
KIAA0194 gene,
partial cds



0.20 



0.20 



0.20 



c-src=pp60c-src,
sdr=src downstream 
region 



Mus musculus Balb/c
brain-specific kinase
(Bsk) mRNA,
complete cds,



0.20 



0.20 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION | P VALUE



<NONE> 



<NONE> 



<NONE> 



<NONE>



<NONE> 



<NONE> 



<NONE> 



1170683 



3721862 



126363 



4159887 



<NONE>

KINASE ALPHA
REGULATORY CHAIN,

SKELETAL MUSCLE

ISOFORM

(PHOSPHORYLASE KINASE
ALPHA M SUBUNIT)
>gi|2135923|pir||I38111
phosphorylase kinase (EC
2.7.1.38) - human >gi|791043



(AB016024) Pfj2 [Plasmodium
falciparum]
LAMININ ALPHA-1 CHAIN
PRECURSOR precursor -
human



<NONE>



<NONE> 



<NONE>



(AC004908) similar to

ribosomal protein L23a; similar
to P29316 (PID:g132848)
[Homo sapiens]



206712 



(M64793) salivary proline-rich 
protein [Rattus norvegicus]



7.4 



1.9 



0.65 



0.52



0.51



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ffl Nearest 
ACCESSIONS 


Neighbor (BlastN vs. ( 
\ DESCRIPTION 


Denbank) 
P VALUE 


j Nearest Neigh 
ACCESSION 


&or (Blast A vs. Non-Redundant Proteins) "1 
DESCRIPTION PVALlJ 


521 


AF034460 


renicuuum tnorrui 
internal transcribed 
spacer 1, 5.8S 
ribosomal RNA gene 
and internal 
transcribed spacer 2. 
complete sequence; 
and 28S ribosomal 
RNA gene, partial 
sequence 


0.20 


114136 


AMINO-ACID 
ACETYLTRANSFERASE 
Pseudomonas aeruginosa 
>gi|I51036 (M38358)N- 
acetylglutamate synthase 
Pseudomonas aeruginosa] 


039 J 


522 


U95098 


Xcnopus laevis 
mitotic 

phosphoprotein 44 
mRNA, partial cds 


0.20 


2842674 


TUV UUiVlAilvTL?UA l\ 

ASSOCIATING FACTOR i (B- 
CELL-SPECIFIC 
COACTIVATOR OBF-1) (OCT 
BINDING FACTOR I) (BOB- 
l)(OCA-B) Bobl.B^ccll- 
Specific - mouse 

loiojDosji /yojz 
mBob1=B-cell specific
transcriptional coactivator line
J558L, Peptide, 256 aa]
>gi| I 37y2 (U43788) Oct
binding factor 1 [Mus musculus]


0.073


523 


X95971 


S.lividans groEL2 
gene 


0.20 


3925277 


(ALU3J64i) similar to
Uncharacterized protein family
UPF0034. Double-stranded
RNA binding motif; cDNA EST
yk4S9b3.5 comes from this
gene; cDNA EST yk439g7.5
comes from this gene
[Caenorhabditis elegans]


4e-19


524 


L41502 


Ovis aries
vasopressin V1
receptor (V1R) gene, 
complete cds 


0.19 


<NONE> 


<NONE>


<NONE>


525 


t 

s 

J03885 


cv. jjncuinunjnc 

oxalacetate
decarboxylase alpha
subunit gene,
complete cds. 


0.19 


<NONE> 


<NONE> 


<NONE> 


526 


s 
c 

AE001451


Helicobacter pylori, 
strain J99 section 12
of 132 of the
complete genome


0.19 


<NONE> 


<NONE>


<NONE>
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/ . 



SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



527 



528 



529 



ACCESSION | DESCRIPTION | P VALUE | ACCESSION



Pedicularis
verticillata
chloroplast DNA,
intergenic region
between trnT(UGU)
D88084 and trnL(UAA) 5' exon
Methanococcus 



0.19 



<NONE> 



jannaschii section 141
of 150 of the
U67599 complete genome 



0.19 



<NONE>



Human beta-spectrin 
(SPTB) mRNA, 
J05500 complete cds.



0.19 



<NONE> 



DESCRIPTION 



<NONE> 



<NONE> 



<NONE> 



P VALUE



<NONE> 



<NONE>



<NONE>



530 



M.mycoides ftsY
gene homologue and
gene encoding
Y10137 hypothetical protein



0.19 



<NONE> 



531 



Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-B)
mRNA, complete
AF027174 cds



0.19



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



532 



Mouse thymic
stromal cell mRNA
for TLSF-beta,
D43805 complete cds



0.19 



<NONE> 



<NONE> 



<NONE>



533 



Tetrahymena 
thermophila
macronuclear gene
encoding ribosomal
AJ012585 protein L3, exons 1-2



534 



Brassica napus 5-enolpyruvylshikimate-3-phosphate
synthase
X51475 gene



0.19 



<NONE> 



0.19 



<NONE> 



535 



Sambucus nigra 
hevein-like protein 
AF074386 mRNA, complete cds



0.19 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



536 



S.cerevisiae 
chromosome X
reading frame ORF
Z49625 YJR125c



0.19 



<NONE> 



Hi 



<NONE> 



<NONE> 
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Nearest Neighbor (BlastN vs. Genbank)
Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
SEQ ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION | P VALUE






H.sapiens pilot 








537 


X63741 


mRNA 


0.19 


<NONE> 


<NONE> 


<NONE>


538 


Y11255


O.latipes mRNA for
annexin max4


0.19 


<NONE> 


<NONE> 


<NONE> 


539 


L63537 


Oncorhynchus mykiss 
(clone Jb~ 10) beta-2 
microglobulin (B2m) 
mRNA, complete cds.


0.19 


<NONE> 


<NONE> 


<NONE> 


540 


X70903 


N.tabacum T92 gene
for auxin-binding 
protein 


0.19 


<NONE> 


<NONE> 


<NONE> 


541 


U61958 


Caenorhabditis
elegans cosmid 
C25A8 


0.19 


<NONE> 


<NONE> 


<NONE> 


542 


U33959 


Macaca fascicularis
fertilin beta mRNA,
complete cds 


0.19 


<NONE>


<NONE> 


<NONE> 


543 


Z49835 


H.sapiens mRNA for 
protein disulfide 
isomerase


0.19


2113940 


(Z95556) hypothetical protein 
Rv2507 


9.4


544 


AF035458 


Spinacia oleracea
heat shock 70 protein 
protein, complete cds 


0.19 


267293 


PROBABLE E4 PROTEIN
papillomavirus (type 1)
>gi|61015 (X62844) E4 gene
product [Pygmy chimpanzee 
papillomavirus type 1] 


9.4 


545 


U23441 


Tetrahymena 
thermophila B 
internal deletion 
sequence. 


0.19


3877185 


(Z66563) F46C3.2 
[Caenorhabditis elegans]


9.3


546 


U53921 


Pneumocystis carinii
major surface 
glycoprotein 


0.19 


3548901 


(AF052502) DA26 homolog 
[Epiphyas postvittana 
nucleopolyhedrovirus]


9.3 


547 


LU002 


Rat ankyrin binding 
glycoprotein-1 related
mRNA sequence. 


0.19


3337352 


(AC004481) putative chromatin 
structural protein Supt5hp 




548 


U67560 


Methanococcus 
jannaschii section 102 
of 150 of the 
complete genome 


0.19 


3183689 


(Y13585) serotonin receptor 4
[Cavia porcellus] 


8.7 


549 


U18424 


Mus musculus 
bacteria binding 
macrophage receptor
MARCO mRNA,
complete cds. 


0.19 


3659853 


(AF089083) complement 
component C1qB like protein


7.1 
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